Emulation, ±4 W/m² Long Wave Cloud Forcing Error, and Meaning

Guest post by Pat Frank

My September 7 post describing the recent paper published in Frontiers in Earth Science on GCM physical error analysis attracted a lot of attention, both supportive and critical.

Among other things, the paper showed that the air temperature projections of advanced GCMs are just linear extrapolations of fractional greenhouse gas (GHG) forcing.

Emulation

The paper presented a GCM emulation equation expressing this linear relationship, along with extensive demonstrations of its unvarying success.

In the paper, GCMs are treated as a black box. GHG forcing goes in, air temperature projections come out. These observables are the points at issue. What happens inside the black box is irrelevant.

In the emulation equation of the paper, GHG forcing goes in and successfully emulated GCM air temperature projections come out. Just as they do in GCMs. In every case, GCM and emulation, air temperature is a linear extrapolation of GHG forcing.

Nick Stokes’ recent post proposed that, “Given a solution f(t) of a GCM, you can actually emulate it perfectly with a huge variety of DEs [differential equations].” This, he supposed, is a criticism of the linear emulation equation in the paper.

However, in every single one of those DEs, GHG forcing would have to go in, and a linear extrapolation of fractional GHG forcing would have to come out. If the DE did not behave linearly, the air temperature emulation would fail.

It would not matter what differential loop-de-loops occurred in Nick’s DEs between the inputs and the outputs. The DE outputs must necessarily be a linear extrapolation of the inputs. Were they not, the emulations would fail.

That necessary linearity means that Nick Stokes’ entire huge variety of DEs would merely be a set of unnecessarily complex examples validating the linear emulation equation in my paper.

Nick’s DEs would just be linear emulators with extraneous differential gargoyles; inessential decorations stuck on for artistic, or in his case polemical, reasons.

Nick Stokes’ DEs are just more complicated ways of demonstrating the same insight as is in the paper: that GCM air temperature projections are merely linear extrapolations of fractional GHG forcing.

His DEs add nothing to our understanding. Nor would they disprove the power of the original linear emulation equation.

The emulator equation takes the same physical variables as GCMs, engages them in the same physically relevant way, and produces the same expectation values. Its behavior duplicates all the important observable qualities of any given GCM.

The emulation equation displays the same sensitivity to forcing inputs as the GCMs. It therefore displays the same sensitivity to the physical uncertainty associated with those very same forcings.

Because emulator and GCM have identical sensitivity to inputs, the emulator will necessarily reveal the reliability of GCM outputs when it is used to propagate input uncertainty.

In short, the successful emulator can be used to predict how the GCM behaves; something directly indicated by the identity of sensitivity to inputs. They are both, emulator and GCM, linear extrapolation machines.

Again, the emulation equation outputs display the same sensitivity to forcing inputs as the GCMs. It therefore has the same sensitivity as the GCMs to the uncertainty associated with those very same forcings.
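
To make the black-box point concrete, here is a minimal sketch of a generic linear forcing emulator of the kind described above, written in Python. The scale coefficient and reference forcing below are hypothetical placeholders for illustration, not the fitted values from the paper.

    # Minimal sketch of a generic linear GHG-forcing emulator.
    # The scale coefficient and reference forcing are hypothetical
    # placeholders, not the paper's fitted values.

    def linear_emulation(annual_forcing_increments, f0=33.3, scale=13.9):
        """Projected temperature change as a linear extrapolation of
        fractional GHG forcing: dT = scale * (cumulative dF / f0)."""
        temps = []
        cumulative_forcing = 0.0
        for df in annual_forcing_increments:
            cumulative_forcing += df            # accumulate GHG forcing
            temps.append(scale * cumulative_forcing / f0)  # linear in fractional forcing
        return temps

    # Example: 100 years of a constant 0.035 W/m^2 annual forcing increment.
    projection = linear_emulation([0.035] * 100)
    print(round(projection[-1], 2), "K after 100 years")

However a GCM's internal machinery is arranged, the point above is that its air temperature projections can be matched by something of this simple linear form.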

Propagation of Non-normal Systematic Error

I posted a long extract from relevant literature on the meaning and method of error propagation, here. Most of the papers are from engineering journals.

This is not unexpected given the extremely critical attention engineers must pay to accuracy. Their work products have to perform effectively under the constraints of safety and economic survival.

However, special notice is given to the paper of Vasquez and Whiting, who examine error analysis for complex non-linear models.

An extended quote is worthwhile:

“… systematic errors are associated with calibration bias in [methods] and equipment… Experimentalists have paid significant attention to the effect of random errors on uncertainty propagation in chemical and physical property estimation. However, even though the concept of systematic error is clear, there is a surprising paucity of methodologies to deal with the propagation analysis of systematic errors. The effect of the latter can be more significant than usually expected.

“Usually, it is assumed that the scientist has reduced the systematic error to a minimum, but there are always irreducible residual systematic errors. On the other hand, there is a psychological perception that reporting estimates of systematic errors decreases the quality and credibility of the experimental measurements, which explains why bias error estimates are hardly ever found in literature data sources.”

“Of particular interest are the effects of possible calibration errors in experimental measurements. The results are analyzed through the use of cumulative probability distributions (cdf) for the output variables of the model.

“As noted by Vasquez and Whiting (1998) in the analysis of thermodynamic data, the systematic errors detected are not constant and tend to be a function of the magnitude of the variables measured.

When several sources of systematic errors are identified, [uncertainty due to systematic error] beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:

“beta = sqrt[ sum_i (theta_S_i)^2 ],

“where “i” defines the sources of bias errors and theta_S is the bias range within the error source i. (my bold)”

That is, in non-linear models the uncertainty due to systematic error is propagated as the root-sum-square.

This is the correct calculation of total uncertainty in a final result, and is the approach taken in my paper.
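
For readers who want the arithmetic spelled out, here is a minimal sketch of that root-sum-square combination; the three bias ranges are invented for illustration only.

    import math

    def rss(bias_ranges):
        """Combine systematic bias ranges (theta_S_i) from independent
        error sources into a total uncertainty beta by root-sum-square."""
        return math.sqrt(sum(theta ** 2 for theta in bias_ranges))

    # Hypothetical bias ranges from three independent error sources (W/m^2).
    print(rss([4.0, 1.5, 0.5]))  # -> about 4.3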

The meaning of ±4 W/m² Long Wave Cloud Forcing Error

This illustration might clarify the meaning of ±4 W/m^2 of uncertainty in annual average LWCF.

The question to be addressed is: what accuracy in simulated cloud fraction is necessary to resolve the annual impact of CO₂ forcing?

We know from Lauer and Hamilton, 2013 that the annual average ±12.1% error in CMIP5 simulated cloud fraction (CF) produces an annual average ±4 W/m^2 error in long wave cloud forcing (LWCF).

We also know that the annual average increase in CO₂ forcing is about 0.035 W/m^2.

Assuming a linear relationship between cloud fraction error and LWCF error, the GCM annual ±12.1% CF error is proportionately responsible for ±4 W/m^2 annual average LWCF error.

Then one can estimate the level of GCM resolution necessary to reveal the annual average cloud fraction response to CO₂ forcing as,

(0.035 W/m^2 / ±4 W/m^2) * ±12.1% cloud fraction = 0.11%

That is, a GCM must be able to resolve a 0.11% change in cloud fraction to be able to detect the cloud response to the annual average 0.035 W/m^2 increase in CO₂ forcing.

A climate model must accurately simulate cloud response to 0.11% in CF to resolve the annual impact of CO₂ emissions on the climate.

That is, the cloud feedback to the annual 0.035 W/m^2 CO₂ forcing must be known, and must be simulated to a resolution of 0.11% in CF, before one can know how clouds respond to annual CO₂ forcing.

Here’s an alternative approach. We know the total tropospheric cloud feedback effect of the global average 67% cloud cover is about -25 W/m^2.

The annual tropospheric CO₂ forcing is, again, about 0.035 W/m^2. The CF equivalent that produces this feedback energy flux is again linearly estimated as,

(0.035 W/m^2 / |25 W/m^2|) * 67% = 0.094%.

That is, the second result is that cloud fraction must be simulated to a resolution of 0.094%, to reveal the feedback response of clouds to the CO₂ annual 0.035 W/m^2 forcing.
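
Both linear scaling estimates reduce to one line of arithmetic each; a minimal sketch follows, with the numbers taken from the text above.

    # Linear scaling estimates of the cloud-fraction (CF) resolution needed
    # to resolve the annual CO2 forcing increment (values from the text).

    annual_co2_forcing = 0.035    # W/m^2 per year

    # Method 1: scale the CMIP5 calibration error statistics.
    lwcf_error = 4.0              # +/- W/m^2 annual average LWCF error
    cf_error = 12.1               # +/- % annual average CF error
    print(annual_co2_forcing / lwcf_error * cf_error)      # ~0.11 % CF

    # Method 2: scale the total cloud feedback of global cloud cover.
    cloud_feedback = 25.0         # |W/m^2| net tropospheric cloud feedback
    global_cloud_cover = 67.0     # % global average cloud cover
    print(annual_co2_forcing / cloud_feedback * global_cloud_cover)  # ~0.094 % CF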

Assuming the linear estimates are reasonable, both methods indicate that about 0.1% in CF model resolution is needed to accurately simulate the annual cloud feedback response of the climate to an annual 0.035 W/m^2 of CO₂ forcing.

This is why the uncertainty in projected air temperature is so great. The needed resolution is roughly 100 times finer than the available resolution.

To achieve the needed level of resolution, the model must accurately simulate cloud type, cloud distribution and cloud height, as well as precipitation and tropical thunderstorms, all to 0.1% accuracy. This requirement is an impossibility.

The CMIP5 GCM annual average 12.1% error in simulated CF is the resolution lower limit. This lower limit is 121 times larger than the 0.1% resolution limit needed to model the cloud feedback due to the annual 0.035 W/m^2 of CO₂ forcing.

This analysis illustrates the meaning of the ±4 W/m^2 LWCF error in the tropospheric feedback effect of cloud cover.

The calibration uncertainty in LWCF reflects the inability of climate models to simulate CF, and in so doing indicates the overall level of ignorance concerning cloud response and feedback.

The CF ignorance means that tropospheric thermal energy flux is never known to better than ±4 W/m^2, whether forcing from CO₂ emissions is present or not.

When forcing from CO₂ emissions is present, its effects cannot be detected in a simulation that cannot model cloud feedback response to better than ±4 W/m^2.

GCMs cannot simulate cloud response to 0.1% accuracy. They cannot simulate cloud response to 1% accuracy. Or to 10% accuracy.

Does cloud cover increase with CO₂ forcing? Does it decrease? Do cloud types change? Do they remain the same?

What happens to tropical thunderstorms? Do they become more intense, less intense, or what? Does precipitation increase, or decrease?

None of this can be simulated. None of it can presently be known. The effect of CO₂ emissions on the climate is invisible to current GCMs.

The answer to any and all these questions is very far below the resolution limits of every single advanced GCM in the world today.

The answers are not even empirically available because satellite observations are not better than about ±10% in CF.

Meaning

Present advanced GCMs cannot simulate how clouds will respond to CO₂ forcing. Given the tiny perturbation annual CO₂ forcing represents, it seems unlikely that GCMs will be able to simulate a cloud response in the lifetime of most people alive today.

The GCM CF error stems from deficient physical theory. It is therefore not possible for any GCM to resolve or simulate the effect of CO₂ emissions, if any, on air temperature.

Theory-error enters into every step of a simulation. Theory-error means that an equilibrated base-state climate is an erroneous representation of the correct climate energy-state.

Subsequent climate states in a step-wise simulation are further distorted by application of a deficient theory.

Simulations start out wrong, and get worse.

As a GCM steps through a climate simulation in an air temperature projection, knowledge of the global CF consequent to the increase in CO₂ diminishes to zero pretty much in the first simulation step.

GCMs cannot simulate the global cloud response to CO₂ forcing, and thus cloud feedback, at all for any step.

This remains true in every step of a simulation. And the step-wise uncertainty means that the air temperature projection uncertainty compounds, as Vasquez and Whiting note.

In a futures projection, neither the sign nor the magnitude of the true error can be known, because there are no observables. For this reason, an uncertainty is calculated instead, using model calibration error.

Total ignorance concerning the simulated air temperature is a necessary consequence of a cloud response roughly 120-fold smaller than the GCM resolution limit; the response is far too small for the models to simulate the cloud feedback to annual CO₂ forcing.

On an annual average basis, the uncertainty in CF feedback into LWCF is ±114 times larger than the perturbation to be resolved.

The CF response is so poorly known that even the first simulation step enters terra incognita.

The uncertainty in projected air temperature increases so dramatically because the model is step-by-step walking away from an initial knowledge of air temperature at projection time t = 0, further and further into deep ignorance.

The GCM step-by-step journey into deeper ignorance provides the physical rationale for the step-by-step root-sum-square propagation of LWCF error.
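
A minimal sketch of that step-wise root-sum-square propagation: if each simulation step contributes the same calibration-derived temperature uncertainty u, the propagated uncertainty after n steps grows as u times the square root of n. The per-step value below is a hypothetical placeholder, not the value computed in the paper.

    import math

    def propagate(per_step_uncertainty, n_steps):
        """Root-sum-square propagation of a constant per-step uncertainty;
        n identical terms combine to per_step_uncertainty * sqrt(n)."""
        return math.sqrt(sum(per_step_uncertainty ** 2 for _ in range(n_steps)))

    # Hypothetical per-step (annual) air-temperature uncertainty in kelvin.
    u_step = 1.8
    for years in (1, 10, 50, 100):
        print(years, "years: +/-", round(propagate(u_step, years), 1), "K")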

The propagation of the GCM LWCF calibration error statistic and the large resultant uncertainty in projected air temperature is a direct manifestation of this total ignorance.

Current GCM air temperature projections have no physical meaning.

578 thoughts on “Emulation, ±4 W/m² Long Wave Cloud Forcing Error, and Meaning”

  1. Dr Frank

    Many thanks for your fine exposition. This, and the more intelligent ripostes to your work, have been an education.

    “…The CMIP5 GCM annual average 12.1% error in simulated CF is the resolution lower limit. This lower limit is 121 times larger than the 0.1% resolution limit needed to model the cloud feedback due to the annual 0.035 W/m^2 of CO₂ forcing…”

    That’s the average. Are there any GCMs that get much closer in tracking CF? Forgive me if you have covered this already.

    • First, let me take this opportunity to once again thank Anthony Watts and Charles the Moderator.

      We’d (I’d) be pretty much lost without you. You’ve changed the history of humankind for the better.

      mothcatcher, different GCMs are parameterized differently when they’re tuned to reproduce target observables such as the 20th century temperature trend.

      None of them are much more accurate than another. Those that end up better at tracking this are worse at tracking that. It’s all pretty much a matter of near happenstance.

      • Thank you Dr. Frank for bringing this to the world’s attention. Both error and uncertainty analysis are traditional engineering concepts. As an EE who has done design work of electronic equipment, I have a very good expectation of what uncertainty means.

        You can sit down and design a 3 stage RF amplifier with an adequate noise figure and bandwidth on paper. You can evaluate available components and how tolerance statistics affect the end result. But, when you get down to the end, you must decide: will this design work when subjected to manufacturing? This is where uncertainty reigns. Are component specs for sure? Will permeability of ferrite cores be what you expected and have the bandwidth required? Will there be good trace isolation? Will the ground plane be adequate? Will unwanted spurious coupling be encountered? Will interstage coupling be correct? And on and on. These are all uncertainties that are not encountered in (nor generally allowed for in) the design equations.

        Do these uncertainties ring any bells with the folks dealing with GCM’s? Are all their inputs and equations and knowledge more than adequate to address the uncertainties in the parameters and measurements they are using? I personally doubt it. The world wide variables in the atmosphere that are not part of the equations and models must be massive. The possibility of wide, wide uncertainty in the outputs must be expected and accounted for if the models are to be believed.

      • Pat,
        “It’s all pretty much a matter of near happenstance.” Trade offs! However, it calls into serious question the claim that the models are based on physics when they don’t have a general solution for all the output parameters and have to customize the models for the parameters they are most interested in.

        • It’s a very worthwhile paper, isn’t it, David.

          When I was working on the study, I wrote to Prof. Whiting to ask for a reprint of a paper in the journal Fluid Phase Equilibria, which I couldn’t access.

          After I asked for the reprint (which he sent), I observed to him that, “If you don’t mind my observation, after scanning some of your papers, which I’ve just found, you are in an excellent position to very critically assess the way climate modelers assess model error.

          Climate modelers never propagate error through their climate models, and never publish simulations with valid physical uncertainty limits. From my conversations with one or two of them, model error propagation seems to be a completely foreign idea.

          Here’s what he wrote back, “Yes, it is surprisingly unusual for modelers to include uncertainty analyses. In my work, I’ve tried to treat a computer model as an experimentalist treats a piece of laboratory equipment. A result should never be reported without giving reasonable error estimates.”

          It seems modelers elsewhere are also remiss.

          • “From my conversations with one or two of them, model error propagation seems to be a completely foreign idea.”

            It’s how they are trained. They are more computer scientists than physical scientists. The almighty computer program will spit out a calculation that is the TRUTH – no uncertainty allowed.

            Most of them have never heard of significant digits or if they have it didn’t mean anything to them. Most of them have never set up a problem on an analog computer where inputs can’t be set to an arbitrary precision and therefore the output is never the same twice.

          • “0.1 When reporting the result of a measurement of a physical quantity, it is obligatory that some quantitative indication of the quality of the result be given so that those who use it can assess its reliability. Without such an indication, measurement results cannot be compared, either among themselves or with reference values given in a specification or standard. It is therefore necessary that there be a readily implemented, easily understood, and generally accepted procedure for characterizing the quality of a result of a measurement, that is, for evaluating and expressing its uncertainty.”

            https://www.isobudgets.com/pdf/uncertainty-guides/bipm-jcgm-100-2008-e-gum-evaluation-of-measurement-data-guide-to-the-expression-of-uncertainty-in-measurement.pdf

          • “They are more computer scientists than physical scientists.”

            By looking over code of ‘state of the art’ climate computer models, I can be almost sure they aren’t either.

            Not computer scientists, not physicists. They are simply climastrologers.

    • Quote: “Many thanks for your fine exposition. This, and the more intelligent ripostes to your work, have been an education.”

      This article made it for me. Clear as crystal. Reminds me of some years back I understood the fabrication of hockey sticks.

  2. Pat, thanks for the plain language explanation of your paper.

    Question: Given current and past temperature stability during ever-increasing CO2 fractions, should we not expect the probability of cloud formation to follow a normal distribution curve?

    That is to say: Scientists are duty-bound to enumerate the maximum uncertainty in the entire universe of possibilities. Nonetheless, an outcome near the mean of the reference fraction seems far more probable than one at either extreme.

    • There’s no reason to expect a normal curve. This is the same conceptual shortcut that leads people to incorrectly apply stochastic error corrections to systematic error. In general, the distinction is precisely that systematic effects are highly unlikely to be normally distributed.

      • Breeze,
        I’m postulating that a dramatic change is less likely to occur than one closer to a current reference.

        If I understand you correctly, you’re arguing for an equal probability for the gamut of potential outcomes. This position defies logic, as prevailing conditions indicate a muted response at best.

        Probability forecasting is based on analysis of past outcomes.

        • Robr: ” This position defies logic, as prevailing conditions indicate a muted response at best.”

          The natural variability of just the annual global temperature average would argue against a normal probability distribution for the outcomes in our ecosphere, let alone on a regional basis. It is highly likely that there is a continuum of near-equal possibilities with extended tails in both directions. On an oscilloscope it would look like a rounded off square wave pulse with exponential rise times and fall times.

          • Tim G.,
            I beg to differ, as global temperature (in the face of rapidly increased CO2) has been remarkably flat, albeit with some warming.

            Arguing for an even probability distribution, in the face of said stability, requires certitude in an exponential CF response to rising CO2

          • RobR,

            go here: http://images.remss.com/msu/msu_time_series.html

            You will see significant annual variation in the average global temperature. If you expand the time series, say by 500-700 years, the variation is even more pronounced.

            “Arguing for an even probability distribution, in the face of said stability, requires certitude in an exponential CF response to rising CO2”

            Huh? How do you figure an even probability distribution requires an exponential response of any kind driven by anything? A Gaussian distribution has an exponential term. Not all probability distributions do. See the Cauchy distribution.

        • RobR,

          The normal distribution is a very specific distribution, it is not just the general idea that dramatic change is less likely to occur than one closer to current reference.

          Specifically, the central limit theorem (which justifies the assumption of normality) applies to “any set of variates with any distribution having a finite mean and variance tends to the normal distribution.” (http://mathworld.wolfram.com/NormalDistribution.html). The problem is that this describes a set of variables, not a phenomenon under investigation.

          So while the set of variables that fully describes climate will obey the theorem, any set of explanatory variables which we may choose to invoke to explain that phenomenon will also obey it. Since our collection of explanatory variables is almost certainly incomplete for a complex problem, the two distributions won’t line up. This will skew your observation, because it then isn’t governed by a fixed set of variables that obeys the theorem.

          That’s why Zipfian (long-tailed) distributions are the norm. That means that, even though you might pin down the main drivers, the smaller ones can always have a larger effect on the outcome, since the harmonic series doesn’t converge, and the more terms you have, the quicker this can happen.

          Six sigma may work in the engineering world, but not in the world of observing complex phenomena.

    • Rob, I can’t speak at all to how the climate self-adjusts to perturbations, or what clouds might do.

      The climate does seem to be pretty stable — at least presently. How that is achieved is the big question.

    • Possibly, if we believe cloud formation is caused by many independent random variables. This is the central limit theorem.

      • There lies the rub: There’s no reason to believe “cloud formation is caused by many independent random variables”.

        • Surface irregularities creating local air turbulence. Humidity-driven vertical convection. Liquid water droplet (fog) density-driven vertical convection. Solar-heat internal cloud vertical convection. Water condensation via local temperatures. Solar vaporization of cloud droplets. Local air pressure variations. Wind driven evaporation of fog. Microclimate interactions. Rain. Wind-driven particulates. Ionization from electrical charges. Warmed surface convection currents. Mixing of air currents. Asymmetry throughout. Then add nighttime radiation to a non-uniform 4K night sky.

    • RobR
      “Question: Given current and past temperature stability during ever-increasing CO2 fractions, should we not expect the probability of cloud formation to follow a normal distribution curve.”

      No. We would probably find that the distribution is log normal if we looked at it. Many years ago I studied airborne radioactivity emission distributions and found that almost always the distributions were log normal.

  3. There is so much chaos, uncertainty and feedbacks in the Earth’s climate it really doesn’t matter what the climate modelers do or don’t do, they will never get it right.. or even close.. ever.

    • Yes rbabcock,
      In Nick Stokes’ recent post I alluded to that when I asked —
      “Nick,
      Your diagram of Lorenz attractor shows only 2 loci of quasi-stability, how many does our climate have? And what evisdence do you have to show this.”
      [corrected the dumb speling mastaeke]
      To which the answer was —
      Nick Stokes
      September 16, 2019 at 11:04 pm

      “Well, that gets into tipping points and all that. And the answer is, nobody knows. Points of stability are not a big feature of the scenario runs performed to date.”

      So we don’t know how many stable loci there are, nor do we know the probable way the climate moves between them. Not so much truly known about our climate then, especially the why, when, and how the climate changes direction. So just how can you possibly model long term climate with so much ignorance of the basic mechanisms?

      • “So just how can you possibly model long term climate with so much ignorance of the basic mechanisms?”

        Easy: “🎶Razzle-dazzle ’em!🎶”

      • I have been wondering this for years.

        Nick said, “Points of stability are not a big feature…to date.”

        They are everything in a complex non-linear system. We simply cannot assess what is “not normal” if we do not know what is normal, and understand the bounds of “normal” – i.e., for each locus there will be a large number (essentially an infinite number) of bounded states available, all perfectly “normal,” and at some point, the system will flip to another attractor.

        We know of two loci over the last 800,000 years: quasi-regular glacial and interglacials (e.g., see NOAA representation here: https://www.climate.gov/sites/default/files/PaleoTemp_EPICA_large.png).

        We really can’t model “climate” unless we can model that.

  4. But the GCMs are so beautiful we must keep using them even if they disagree with reality. We can’t have spent all this money on nothing.

    Honestly, the GCMs have failed to accurately predict anything for the past 30 years. Have we found the main root cause, or is this one of a hundred causes in the chain, any one of which causes them to fail?

    • “We can’t have spent all this money on nothing.”

      You are correct. It was spent to justify arbitrary taxation by the UN, politicians and bureaucrats across the globe. Everyone responsible for this scam should be forced to find real work, especially the politicians.

      • Eric Barnes
        ” Everyone responsible for this scam should be forced to find real work, …”
        Like breaking rocks on a chain gang!

    • “But the GCMs are so beautiful we must keep using them even if they disagree with reality. We can’t have spent all this money on nothing.”
      That may be the funniest comment I have heard this month!
      Bravo!
      Unfortunately, like much comedy, it is also incredibly sad when you really think about it for a while.

  5. Dr Frank,

    • I think you have taken a wise path by directly addressing the concern raised by Dr. Spencer (i.e., that with each model iteration, inputs change in response to model outputs and those changes are not adequately modeled). Well done.

    • I agree. While Frank’s explanation is rather wordy for us lay folk, I noted to Nick Stokes that the difference between a CFD equation used in the normal sense and the application of CFD in weather and GCMs is that industrial CFDs use known parameters to calculate the unknown. GCMs use a bunch of unknowns to calculate a desired outcome. It is a matter of days before the error in the CFD/GCM used in weather “blows up”, as Nick put it, and that is because the parameters become more and more unknown over time. It stands to reason that if the equation blows up in a few days, it doesn’t stand a chance at calculating 100 years.

      As such, all of the “gargoyles” of a GCM are nothing more than decoration. The reality is, the GCM ends up being a linear calculation based on the creator’s input of an Estimated Climate Sensitivity to CO2. Frank just made a linear equation using the same premise. For that matter, the other part of Frank’s paper dealing with a GCM calculating CF as a function of CO2 forcing rests on a false assumption. We have no empirical proof that CO2 forcing has doodley squat impact on CF … thus its imaginary number for LWCF is just fiction. But … it doesn’t matter, it is the ECS that counts.

      Good Job Dr. Frank

      • “that industrial CFDs use known parameters to calculate the unknown”
        I don’t think you know anything about industrial CFD. Parameters are never known with any certainty. Take for example the modelling of an ICE, which is very big business. You need a model of combustion kinetics. Do you think that is known with certainty? You don’t, of course, even know the properties of the fuel that is being used in any instance. They also have a problem of modelling IR, which is important. The smokiness of the gas is a factor.

        Which raises the other point in CFD simulations; even if you could get perfect knowledge in analysing a notional experiment, it varies as soon as you try to apply it. What is the turbulence of the oncoming flow for an aircraft wing? What is the temperature, even? Etc.

        “It is a matter of days before the error in the CFD/GCM used in weather “blows up” as Nick put it”
        No, it isn’t, and I didn’t. That is actually the point of GCM’s. They go beyond the time when weather can be predicted, but they don’t blow up. They keep calculating perfectly reasonable weather. It isn’t a reliable forecast any more, but it has the same statistical characteristics, which is what determines climate.

        • LOL!!!

          Stokes thinks our climate is as simple as an ICE! Priceless!

          Nick, sometimes it is best to say nothing, rather than prove yourself to be a zealot and a fool. Please learn this lesson, and grow up.

          See Nick. See shark. See Nick jump shark. Jump Nick, jump! LOL

        • “…You don’t, of course, even know the properties of the fuel that is being used in any instance…”

          Of course you do. You think GM or BMW or whoever doesn’t know if it will be fed by E85, kerosene, Pepsi, or unicorn farts?

          “…That is actually the point of GCM’s. They go beyond the time when weather can be predicted, but they don’t blow up. They keep calculating perfectly reasonable weather…”

          Climate models don’t do weather.

          • You could set up a bounded random weather generator that would do the same thing, but that doesn’t mean it has any practical application.

          • “GM or BMW or whoever doesn’t know”
            They don’t know exactly what owners are going to put in the tank in terms of fuel type, octane number etc. But there are a lot of properties that matter. Fuel viscosity varies and is temperature dependent. It’s pretty hard to even get an accurate figure to put into a model, let alone know what it will be in the wild. Volatility is very dependent on temperature – what should it be? Etc.

            Due to the needs of engineering, a great deal of experimental work goes into determining relevant physical constants of the materials. Density and viscosity of various possible fuels as a function of temperature are no different. Several examples are

            https://www.researchgate.net/publication/273739764_Temperature_dependence_density_and_kinematic_viscosity_of_petrol_bioethanol_and_their_blends

            and
            https://pubs.acs.org/doi/abs/10.1021/ef2007936

            A big difference between inputs to engineering models and climate models is that such fundamental, needed, inputs are obtainable from controlled laboratory experiments, with well characterized uncertainties. When such values are used in engineering models, error propagation can proceed with confidence.

          • “A big difference between inputs to engineering models and climate models is that such fundamental, needed, inputs are obtainable from controlled laboratory experiments, with well characterized uncertainties”

            Ha! From the abstract of your first link:
            ” The coefficients of determination R2 have achieved high values 0.99 for temperature dependence density and from 0.89 to 0.97 for temperature dependence kinematic viscosity. The created mathematical models could be used to the predict flow behaviour of petrol, bioethanol and their blends.”

            0.89 to 0.97 may be characterised, but it isn’t great. But more to the point, note the last. They use models to fill in between the sparse measurements.

          • “…They don’t know exactly what owners are going to put in the tank in terms of fuel type, octane number etc…”

            Amazing. You’re doubling-down on possibly the dumbest comment I have ever read on this site (and that includes griff thinking 3 x 7 = 20).

            Gasoline refiners and automakers have had a good handle on this for a few decades. Things are pretty well standardized and even regulated. Running a CFD on a few different octane ratings is nothing. And then a prototype can be tested using actual fuel (which is readily available). Manufacturers will even recommend a certain octane level (or higher) for some vehicles.

            You really think engineers just throw their hands up in the air and start designing an ICE considering every sort of fuel possibility and then hope they get lucky? They design based on a fuel from the start.

          • “Running a CFD on a few different octane ratings is nothing. “
            Yes, that’s how you deal with uncertainty – ensemble.

            “You really think engineers just throw their hands up in the air and start designing an ICE”
            I’m sure that they design with fuel expectations. But they have to accommodate uncertainty.

          • 0.89 to 0.97 may be characterised, but it isn’t great.

            Written in all seriousness.

            Then again this is the same individual who thinks the atmosphere is a heat pump and not a heat engine. *shrug*

          • “…Yes, that’s how you deal with uncertainty – ensemble…”

            These are discrete analyses performed on specific and standardized formulas with known physical and chemical characteristics. The ICE design is either compatible and works with a given formulation or it doesn’t.

            Unlike most climate scientists, engineers work with real-world problems. Results need to be accurate, precise, and provable. Ensembles? Pfffft. Save those for the unicorn farts.

          • ” They don’t know exactly what owners are going to put in the tank in terms of fuel type, octane number etc. ”

            Balderdash, my BMW manual tells me what the optimum fuel is for use in my high performance engine … why would an owner even contemplate using inferior fuel ?

            Maybe your old banger still uses chip oil, eh Nick ?

        • While you’re here Nick would it be fair to say what Kevin Trenberth said in journal Nature (“Predictions of Climate”) about climate models in 2007 still stands —

          None of the models used by the IPCC are initialized to the observed state and none of the climate states in the models correspond even remotely to the current observed climate. In particular, the state of the oceans, sea ice and soil moisture has no relationship to the observed state at any recent time in any of the IPCC models. There is neither an El Nino sequence nor any Pacific Decadal Oscillation that replicates the recent past; yet these are critical modes of variability that affect Pacific rim countries and beyond. The Atlantic Multidecadal Oscillation, that may depend on the thermohaline circulation and thus oceanic currents in the Atlantic, is not set up to match today’s state, but it is a critical component of the Atlantic hurricanes and it undoubtedly affects forecasts for the next decade from Brazil to Europe. Moreover, the starting climate state in several of the models may depart significantly from the real climate owing to model errors. I postulate that regional climate change is impossible to deal with properly unless the models are initialized.

          Therefore the problem of overcoming this shortcoming, and facing up to initializing climate models means not only obtaining sufficiently reliable observations of all aspects of the climate system, but also overcoming model biases. So this is a major challenge.

          Are the models initialized with realistic values before they are run?

          • “would it be fair to say”
            Yes, and I’ve been saying it over and over. Models are not initialised to current weather state, and cannot usefully be (well, people are trying decadal predictions, which are something different). In fact, they usually have a many decades spin-up period, which is the antithesis of initialisation to get weather right. It relates to this issue of chaos. The models disconnect from their initial conditions; what you need to know is how climate comes into balance with the long term forcings.

        • Dear Dr. Nick Stokes (In deference to your acolytes who complained I wasn’t being sufficiently reverential.)

          You said, “They keep calculating perfectly reasonable weather. It isn’t a reliable forecast any more, …” I didn’t think that there was so much difference between Aussie English and American English that you would consider unreliable to be reasonable.

        • Ummm … Nick … I may not be as well versed about CFDs as you, but I’m pretty proficient when it comes to ICE. There is this thing called an ECM, and it hooks to the MAP, the O2 sensor, and a few others … and as such, the ECM adjusts the behavior based on known incoming data. That is how you get multi-fuel ICE … the ECM makes the adjustments to fuel mix and timing to ensure that the engine is running within preset parameters that were determined using … drum roll … KNOWN PARAMETERS. I would guess that a CFD was used to model the burn with those known parameters in the design phase, such that they can pre-program the ECM to make its adjustments … but again … you are talking about a live data system, with feedback and adjustment. The adjustments are made using KNOWN data and KNOWN standards.

          But what I was talking about is that in fluid dynamics, you know the viscosity of the liquid, you know the diameter of the pipe, you know the pressure, you know the flow rate … you KNOW a lot of things … then your CFD shows how the liquid moves within the system. A GCM can take the current data, pressure systems, temperatures, etc., and calculate how a weather system is likely to move; however, the certainty of those models, as any meteorologist will tell you, becomes less with each passing day … and when you get out to next week, it’s pretty much just a guess. A GCM run for next year is doing nothing but using averages over time. The problem with those averages is that they have confidence intervals; there are standard deviations for each parameter. The CF for Texas in July will be, for example, 30% plus or minus 12%; thus solar insolation at the surface will be X W/m2 plus or minus Y W/m2, pressure will be …, etc. etc. etc. The problem with GCMs in climate is they do not express their results as plus or minus the uncertainty. They give you a single number, then they run the same model 100 times. THIS IS A FLAWED way of doing things. In order to get a true confidence interval for these calculations, you have to run the model with all possible scenarios, which will be in the tens of thousands of possibilities per grid point using the uncertainty ranges for each parameter, and that for each time interval … which would be a guaranteed “blow up” of the CFD. Further, the error Frank points to will begin to be incorporated into the individual parameter standard deviations … i.e., a systematic error in the system. Kabooom.

          That is why it’s just easier to make a complicated program that, in the end, cancels out all the parameters and uses the ECS … the same as a linear equation using the ECS.

          Just sayin.

          • “you know the viscosity of the liquid”
            I wish. You never do very well, especially its temperature dependence, let alone variation with impurities.
            “you know the pressure, you know the flow rate”
            Actually, mostly the CFD has to work that out. You may know the atmospheric pressure, and there may be a few manometers around, but you won’t have a proper pressure field by observation.

        • One needs to be clear about what one means by “model.” When it is used in the engineering sense of fitting a proposed equation to experimental data, such as the data for physical constants, “model” literally means that, a specific determinate, equation with specific parameter choices. (In the climate science regime, a more proper word would be “simulation.”) In engineering, the equation may be theoretically motivated or may simply be an explicit function that captures certain kinds of behavior such as a polynomial. That equation is then “fit” to the data and the variation of the measured values from the predicted values is minimized by that simple fit. Various conditions usually are also tested to see if the deviations from the fit correspond well or ill to what would be expected of “random” errors, such as a normal distribution. Such tests can typically show if there is some non-random systematic error either due to the apparatus or a mis-match of the equation to the behavior.
          The Rsquared value provides the relative measure of the percentage of the dependent variable variance that the equation explains (from that particular experiment) over the range for which it was measured, under the conditions (i.e. underlying experimental uncertainties) of the experiment. In other words it is a measure of how well the function seems to follow the behavior of the data. A value of 0.9 in general is considered quite good in the sense that that particular equation is useful. Further, the values that come from the fit, including the uncertainty estimates, can be used directly in an error propagation of a calculation that uses the value. A more meaningful quantity for predictive purposes is the standard error of the regression, which gives the percent error to be expected from using that equation, with those parameters, to predict an outcome of another experiment. By that measure, engineering “models” typically are not considered useful unless the standard errors are less than a few percent, as is the case for the particular works cited. Experimental procedures are generally improved over time, which reduces the uncertainties and provides more accurate values (lower uncertainties).

          By contrast, one should compare the inputs to climate simulations. Of particular interest would be what is the magnitude of the standard errors associated with them, whether estimates of them are point values, or functional relations, and whether the variations from the estimate used are normally distributed or otherwise.
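
          As a concrete illustration of the distinction drawn above between Rsquared and the standard error of the regression, here is a minimal sketch using an ordinary least-squares line fit; the data points are invented.

              import numpy as np

              # Invented calibration-style data: y measured at several x values.
              x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
              y = np.array([0.1, 1.9, 4.2, 5.8, 8.1, 9.9])

              # Ordinary least-squares fit of a straight line.
              slope, intercept = np.polyfit(x, y, 1)
              residuals = y - (slope * x + intercept)

              # Rsquared: fraction of the variance in y explained by the fit.
              r_squared = 1.0 - residuals.var() / y.var()

              # Standard error of the regression: typical error to expect when
              # the fitted equation predicts a new measurement (n minus 2 fit
              # parameters).
              std_error = np.sqrt(np.sum(residuals ** 2) / (len(x) - 2))

              print(round(r_squared, 4), round(std_error, 3))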

        • Nick writes

          They go beyond the time when weather can be predicted, but they don’t blow up.

          They do unless their parameters are very carefully chosen. But choosing those parameters isn’t about emulating weather, it’s about cancelling errors.

          And then

          It isn’t a reliable forecast any more, but it has the same statistical characteristics, which is what determines climate.

          The only characteristics that might be considered “reliable” are those near to today. There is no reason to believe future characteristics will be reliable and in fact the model projections have already shown them not to be reliable.

      • Slight correction to the previous comment: where it said the standard error measures how well the function predicts the outcome of a future measurement, it should have included the words “in an absolute sense” meaning that it gives the expected percent error in the predicted future value. In that sense it can be used directly in a propagation of error calculation. Rsquared does not serve that purpose since it only measures the closeness to which the functional form conforms to the functional form of the data.

  6. “even the first simulation step enters terra incognita.”

    Dr. Frank–

    Your propagation of error graphs are hugely entertaining to me, showing the enormous uncertainty that develops with every step of a simulation. This is shocking to the defenders of the consensus and they bring up the artillery to focus on that wonderful expanding balloon of uncertainty.

    But now with your new article on the error in the very first step, why bother propagating the error beyond the first step? Let the defenders defend against this simpler statement.

    • Lance Wallace

      There have been complaints that the +/- 4 W/m^2 uncertainty should only be applied once.

      Arbitrarily select a year (Tsub0) of unknown or assumed zero uncertainty for the temperature to apply the uncertainty of cloud fraction. The resultant predicted temperature of year Tsub1, now has a known minimum uncertainty. Now, since we selected the initial year arbitrarily, what is to prevent us from repeating the selection, this time using year Tsub1? Doing so, we are required to account for the uncertainty of the cloud forcing just as we did the first time, only this time we know what the minimum uncertainty of the temperature is before doing the calculation! We can continue to do this ad infinitum. That is, as long as there is an iterative chain of calculations, the known uncertainties of variables and constants must be taken into account every time the calculations are performed. It is only systematic offsets or biases that need only be adjusted once.

      A bias adjustment of known magnitude will affect the nominal value of the calculation, but not the uncertainty. However, an uncertainty, which is a range of possible values, will not affect the nominal value, but WILL affect the total uncertainty.

      • Clyde,
        “Now, since we selected the initial year arbitrarily, what is to prevent us from repeating the selection, this time using year Tsub1?”
        And for year, read month. Or iteration step in the program (about 30 min). What is special about a year?

        In fact, the 4 Wm⁻² was mainly spatial variation, not over time.

        • Stokes,
          The point being that the calculations cannot be performed in real time, nor at the spatial resolution at which the measured parameters vary. Therefore, coarse resolutions of both time and area have to be adopted for practical reasons.

          Other than the fact that 1 year nearly averages out the land seasonality, it could be a finer step. However, using finer temporal steps not only increases processing time, but increases the complexity of the calculation by having to use appropriate variable values for the time steps decided on. But, as is often the case with you, you have tossed a red herring into the room. Using finer temporal resolutions does not get around the requirement of accounting for the uncertainty at every time-step.

  7. The problem with CF is just the tip of the iceberg. Also consider high altitude water vapor. As theorized by Dr. William Gray, it must also change due to any surface warming and/or changes in evaporation rates. Once again the models cannot track these changes at the needed accuracy to produce a valid result.

    Just how many more of these factors exist? Each one adds exponentially to the already impossible task facing models.

  8. It is entirely obvious that GCMs disguise a (relatively) simple relationship between CO2 and temperature with wholly unnecessary (and necessarily inaccurate) complexity. A couple of lines on an Excel sheet will do just as well in terms of forecasting temperature – delta CO2, sensitivity of temperature to changes in CO2, one multiplied by the other. All you need to know to forecast temperature with any skill is the sensitivity. GCMs can then be run to show the effects on the climate of the increase in temperature.

    But a simple model does not work, because a single sensitivity figure will not both hindcast and forecast accurately, even of a long term trend that allows for a significant amount of natural variation. So we get these huge GCMs which do not improve the situation but allow modelers to hide the problem, which is that any sensitivity to CO2 is swamped by natural variation at every level we can model the physical properties of the climate.

    The fundamental problem remains, that sensitivity to CO2 is still unclear, and estimates vary widely. Yet that sensitivity is absolutely the key thing we need to know if we are to understand and forecast the impact of increased CO2 on the climate. If we actually knew sensitivity, and GCMs were a reasonable approximation of our climate, there would be no debate amongst climate scientists and no reasonable basis for scepticism.

    • Phoenix44: “The fundamental problem remains, that sensitivity to CO2 is still unclear”

      Precisely! We can’t even determine how to derive CO2 sensitivity on Mars where it makes up 95% of the atmosphere. We don’t even know what to measure so that we can begin to derive the answer.

      “If we actually knew [CO2] sensitivity, and GCMs were a reasonable approximation of our climate”

      Then we could prove out the GCM by inputting Mars’ parameters and verifying the resulting output.

      It truly is glaring how pitiful the “science” is that has been dedicated to deriving CO2 sensitivity.

    • Simple and complex GCMs give the same values for climate sensitivities and also for the warming values of different RCP’s. There is no conflict. Or can you show an example that they give different results?

      • Yes. It’s all offsetting errors, Antero. Also known as false precision.

        Kiehl JT. Twentieth century climate model response and climate sensitivity. Geophys Res Lett. 2007;34(22):L22710.

        http://dx.doi.org/10.1029/2007GL031383

        Abstract: Climate forcing and climate sensitivity are two key factors in understanding Earth’s climate. There is considerable interest in decreasing our uncertainty in climate sensitivity. This study explores the role of these two factors in climate simulations of the 20th century. It is found that the total anthropogenic forcing for a wide range of climate models differs by a factor of two and that the total forcing is inversely correlated to climate sensitivity. Much of the uncertainty in total anthropogenic forcing derives from a threefold range of uncertainty in the aerosol forcing used in the simulations.

        p.2 “Note that the range in total anthropogenic forcing is slightly over a factor of 2, which is the same order as the uncertainty in climate sensitivity. These results explain to a large degree why models with such diverse climate sensitivities can all simulate the global anomaly in surface temperature. The magnitude of applied anthropogenic total forcing compensates for the model sensitivity.

  9. An impressive and insightful treatise of errors and error propagation in the context of GCM. Thank you for explaining in plain language.

  10. Just a few sentences in to this article, and I think I just had my “Aha!” moment I have been waiting and hoping for.
    A black box.
    That crystalizes it very clearly.

    • I think it would be more appropriately called …. [The Wiggly Box]. This is what a GCM represents.

      dT=dCO2*[wiggly box]*ECS

      Where dCO2 * ECS establishes the trend … and [wiggly box] adds the wiggles in the prediction line to make it look like a valid computation.

  11. Dr. Frank: “beta = sqrt[ sum_i (theta_S_i)^2 ],
    “where “i” defines the sources of bias errors and theta_S is the bias range within the error source i. (my bold)”

    Is it an issue that you are only evaluating a single bias error source, whereas, for the reasons previously discussed, we know the net energy budget to be essentially accurate? Errors in the LWCF are demonstrably being offset by errors in other energy flux components at each time step such that the balance is maintained (?)

    • How can introducing another “error” increase your understanding of the real world?

      It doesn’t.

      Does it allow your model to better emulate a linear forcing model? Yes. Does it model the physical world? No.

    • S. Geiger, please call me Pat.

      I’m evaluating an uncertainty based upon the lower limit of resolution of GCMs. Off-setting errors do not improve resolution. They just hide the uncertainty in a result.

      Off-setting errors do not improve the physical description. They do not ensure accurate prediction of unknown states.

      In a proper uncertainty analysis, offsetting errors are combined into a total uncertainty as their root-sum-square.

    • An energy budget in balance in a theoretical computation of decidedly many unknowns means what? Since the historical record indicates rather large changes in cycles of 60 to 70 years, 100 years, 1000 years, 2500 years, 24000 years, 40,000 years, etc., what is the logic that says any particular range of years should find a balance in input vs output? Perhaps pre-tuning for any particular short period intrinsically introduces a large bias because that balance is not there in reality.

      • Assuming a balance exists integrated over a time span of anything shorter than a few centuries is crazy. The obvious existence of far longer period (than centuries) oscillations guarantees periodic imbalances of inputs and outflows on grand scales.

  12. Quote:
    “I posted a long extract from relevant literature on the meaning and method of error propagation, here. Most of the papers are from engineering journals.”

    I have worked 40 years in product development using measurements and simulations to improve rock drilling equipment and heavy trucks and buses.

    We have a saying:
    “When a measurement is presented no one but the performer thinks it is reliable, but when a simulation is presented everyone except the performer thinks it is the truth.”

    Thank you for publishing results that I hope will call out the modellers to give evidence of the uncertainties in their work!

  13. If I understand this correctly:

    I find a stick on the ground that’s about a foot long. It’s somewhere between 10 and 14 inches long. I decide to measure the length of my living room to the nearest 1/16th of an inch using that stick.

    My living room is exactly 20 “stick lengths” long. I can repeat the measurement experiment 10 times and get 20 sticks long. My wife can repeat the measurement experiment and get 20 sticks long. A stranger off the street gets the same result. But despite dozens of repeated experiments, I can’t be sure, at a 1/16 of an inch accuracy, how long my living room really is because of the uncertainty in the length of that stick.

    I need a properly calibrated stick, a ruler or a tape measure, before I can claim to know how long my living room is at that level of accuracy. If your device used to measure something is garbage, the results are going to be incredibly uncertain.

    • “I need a properly calibrated stick, a ruler or a tape measure, before I can claim to know how long my living room is at that level of accuracy.”
      No, all you need is to find out the length of the stick in metres. Then you can take all your data, scale it, and you have good results. That is of course what people do when they adjust for thermal expansion of their measures, for example.

      The point relevant here is that the error didn’t compound. It was out by a constant factor, even with multiple measures.

      • Nick Stokes: No, all you need is to find out the length of the stick in metres.

        That’s true. All you ever have to do to remove uncertainty is learn the truth. For a meter stick, however, the measurement of its length will still have an unknown error. Uncertainty will be smaller than in Jim Allison’s description of the problem, maybe only +/- 0.1mm; that uncertainty compounds. He can now measure the room with perhaps only 1/16″ error.

        For Pat Frank’s GCM, of which the unknown meter stick is an analogy, what is needed is a much more accurate estimate of the cloud feedback, which may not be available for a long time. It’s as though there is no way to measure the meter stick.

      • In this analogy, there was no error – everyone measured exactly 20 lengths. The measurement experiment delivered consistent results – it was a very precise experiment.

        The uncertainty did compound though. Each stick length added to the overall uncertainty of the length of the room. Something that is exactly as long as the stick would be 10-14 inches long. If something is two “lengths” long, it would be 20-28 inches long. Twenty “lengths” means that the room is anywhere between 200-280 inches long. See how the more you measure with an uncertain instrument, the more the uncertainty increases? This is accuracy, a different beast than precision.

        I agree with you that if we could remove the uncertainty of the stick, and of cloud formation, then we could find out exactly how long my room is and have better climate models. But billions of dollars later, we’re still scratching our heads at how much carpet I need to buy, and at what the climate will do in the future.
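
        A minimal sketch of the interval arithmetic in this analogy (Python; numbers taken from the comment above): every additional stick-length widens the interval, even though the count of stick-lengths is perfectly repeatable.

          # Stick known only to be between 10 and 14 inches long.
          n = 20                  # stick-lengths laid end to end
          low, high = 10.0, 14.0  # inches
          print(f"Room length: between {n * low:.0f} and {n * high:.0f} inches")
          # -> between 200 and 280 inches: precise repetition, poor accuracy.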

      • “The point relevant here is that the error didn’t compound. It was out by a constant factor, even with multiple measures.”

        Huh? Did you stop to think for even a second before posting this? If the stick is off by an inch then the first measurement will be off by an inch. The second measurement will be off by that first inch plus another inch! And so on. The final measurement will be off by 1 inch multiplied by the number of measurements made! The error certainly compounds!

        It is the same with uncertainty. If the uncertainty of the first measurement is +/- 1 inch, then the uncertainty of the second measurement will be the first uncertainty doubled. If the uncertainty of the first measurement is +/- 1 inch, i.e. from 11 inches to 13 inches, then the second measurement will have an uncertainty of 11 inches +/- 1 inch, i.e. 10 inches to 12 inches, coupled with 13 inches +/- 1 inch, or 12 inches to 14 inches. So the total uncertainty will range from 10 inches to 14 inches, or double the uncertainty of the first measurement. Sooner or later the uncertainty will overwhelm the measurement tool itself!

        • It’s simply a matter of units. If the stick is in fact marked with some antique marking like feet and inches, you’d still have a consistent measure, with no compounding. And when you looked up, in some dusty tome, the units conversion factor, you could do a complete conversion. Or if your carpet supplier happened to remember these units, maybe he could still meet your order.

          • “It’s simply a matter of units. If the stick is in fact marked with some antique marking like feet and inches, you’d still have a consistent measure, with no compounding.”

            You *still* don’t get it about uncertainty, do you? It’s *not* just a matter of units. You are specifying the total length in *inches*, not in some unique unit. The uncertainty of the length of the measuring stick in inches is the issue. Remember, there are no markings on the stick, it is assumed to be 12 inches long no matter how long it actually is, that is the origin of the uncertainty interval! And that uncertainty interval does add with each iteration of using the stick!

            I don’t believe that you can be this obtuse on purpose. Are you having fun trolling everyone?

          • Nick Stokes –> “you’d still have a consistent measure, with no compounding”. You are still dealing with error, not uncertainty.

            Measurement errors occur when you are measuring the same thing with the same device multiple times. You can develop a probability distribution and calculate a value of how accurate the mean is and then assume this is the “true value”. This doesn’t include calibration error or how close the “true value” as measured is to the defined value.

            You’ll note that these measurement errors do not compound; they provide a probability distribution. However, the calibration uncertainty is a systematic error that never disappears.

            The mean also carries the calibration uncertainty, since there is no way to eliminate it. In fact, it is possible that the uncertainty CAN compound. It can compound in either direction, or it is possible that the uncertainties DO offset. The point is that you have no way to know or determine the uncertainty in the final value. The key condition is that you are measuring the same thing, multiple times, with the same device.

            Now, for compounding, do the following: go outside and get a stick from a tree. Break it off to what you think is 12 inches. Visually assess the length and write down what you think may be the uncertainty value as compared to the NIST standard for 12 inches. Then go make 10 consecutive measurements to obtain a length of 120 inches. How would you convey the errors in that? You could assume the measurement errors offset, because you may overlap the line you drew for the previous measurement one time while you missed the line the next time, so that the errors offset. It would be up to you to do the sequence multiple times, assess the differences, and define a +/- error component for the measurement error.

            Now, how about calibration error? Every time you take a sequential measurement the calibration error will add to the previous measurement. Why? When you check your stick against the NIST standard, let’s assume that it comes out 13 inches (+/- an uncertainty, by the way). Your end result will be 1 inch times 10 measurements = 10 inches off. The 1 inch adds each time you make a sequential measurement! This is what uncertainty does.
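
            A minimal sketch of the distinction drawn above (Python; the numbers are the hypothetical ones from the comment): a calibration offset adds linearly with each laid-off length, while a random placement error grows only as the square root of the count.

              import math

              n = 10                 # sequential measurements to lay off "120 inches"
              bias = 1.0             # inches: stick is really 13 in but assumed to be 12 in
              placement_sigma = 0.1  # inches: hypothetical random error in marking each position

              systematic_offset = n * bias                    # 10 inches, grows linearly
              random_spread = placement_sigma * math.sqrt(n)  # ~0.32 inches, grows as sqrt(n)
              print(systematic_offset, round(random_spread, 2))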

          • Jim Gorman
            Not only is there an inherent potential error in the assumed length of the measuring stick, which is additive or systematic, but the error in placing the stick on the ground is a random error that varies every time it is done, and is propagated as a probability distribution.

        • Actually, in the sense of climate models, this isn’t relevant.

          Take a metal rod. We measure it with a ruler we believe to be 100 cm, but we’re uncertain, so it could really be anywhere from 90 cm to 110 cm (+/- 10 cm). The rod measures 10 ruler lengths, so its length is 1000 cm +/- 100 cm. That’s quite a bit of uncertainty!

          Now, we paint the rod black and leave it in the sun (the black paint being analogous to adding CO2). We remeasure and find it 10 cm longer. Since this is less than the +/- 100 cm uncertainty in the original length of the rod, it’s claimed that the uncertainty overwhelms the result.

          But no, we can definitively say it’s longer and, if the expansion is roughly proportional, the uncertainty of the expansion is +/- 1cm.

          The GCM’s don’t stand on their own. They are also calibrated (which, from what Pat quotes below, suggests the uncertainty doesn’t propagate), and we only want to know the difference when varying a parameter (in this case CO2). So putting the uncertainty of an uncalibrated model around the difference from 2 runs of a calibrated model doesn’t seem to be relevant. Perhaps that’s why he has the +/-15C around the 3C anomaly but can’t explain exactly what it means?

          • Brigitte, it is known there is a +/- 12.1% error in annual cloud fraction in the GCMs. Put another way, the models include a cloud fraction that could be 12.1% higher or lower than reality. The GCMs apply a cloud fraction at every iterative step, i.e. annually. So the resulting temperature after the first iteration is affected by this +/- 12.1% in CF, which maps to a +/- 4 W m-2 in forcing. Then in year 2 the GCM applies a cloud fraction which again has the +/- 12.1% error, which further affects the temperature output in year 2. And again in year 3, etc. In this way, that one imperfection in the understanding of cloud fraction propagates into a huge uncertainty in the output temperature.
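
            A sketch of the year-by-year growth described above (Python), assuming, as the paper does, that an annual ±4 W/m^2 LWCF calibration uncertainty accumulates in quadrature through annual projection steps. This shows only the growth of the flux uncertainty statistic, not the paper’s full conversion into a temperature uncertainty.

              import math

              annual_u = 4.0  # W/m^2 per annual step (Lauer and Hamilton calibration error)
              for years in (1, 10, 50, 100):
                  print(years, round(annual_u * math.sqrt(years), 1))
              # 1 -> 4.0, 10 -> 12.6, 50 -> 28.3, 100 -> 40.0 (all +/- W/m^2)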

  14. Quote from the story: “We know the total tropospheric cloud feedback effect of the global 67% in cloud cover is about -25 W/m^2.” That is not cloud feedback, it is cloud forcing. They are two different concepts. Cloud forcing is the sum of two opposite effects of clouds on the radiation budget at the surface. Clouds reduce the incoming solar insolation in the energy budget (my research studies) from 287.2 W/m^2 in the clear sky to 240 W/m^2 in all-sky. At the same time, the GH effect increases from 128.1 W/m^2 to 155.6 W/m^2, and thus the change is +27.2 W/m^2. The net effect of clouds in all-sky conditions is cooling by -19.8 W/m^2. Quite a wide variety of cloud forcing numbers can be found in the scientific literature, from -17.0 to -28.0 W/m^2.

    Cloud feedback is a measure of how cloud forcing varies as climatic conditions change over time, and mainly of how it varies with changing global temperatures.

    IPCC has written this way in AR4, 2007 (Ch. 8, p. 633): “Using feedback parameters from Figure 8.14, it can be estimated that in the presence of water vapour, lapse rate and surface albedo feedbacks, but in the absence of cloud feedbacks, current GCMs would predict a climate sensitivity (±1 standard deviation) of roughly 1.9°C ± 0.15°C (ignoring spread from radiative forcing differences).” My comment: this is TCS, which is a better measure of warming in the century-scale than ECS.

    In AR5 the TCS value is still in the range of 1.8°C to 1.9°C. It means that, according to the IPCC, cloud forcing has been the same since AR4 and no cloud feedback is applied. It means that the IPCC does not know how cloud forcing would change as the temperature changes.

    I think that Pat Frank is driving himself into deeper problems.

    • Antero, your post contradicts itself.

      You wrote, “Quote from the story: “We know the total tropospheric cloud feedback effect of the global 67% in cloud cover is about -25 W/m^2.” That is not cloud feedback, it is cloud forcing.

      Followed by, “Cloud feedback is a measure in which way cloud forcing varies when the climatic conditions change along the time and mainly in which way it varies according to varying global temperatures.

      So, net cloud forcing is not a feedback, except that it’s a result of cloud feedback. Very clear, that.

      I discussed net cloud forcing as the overall effect of cloud feedback; pretty much the way it’s discussed in the literature (Hartmann et al., 1992, https://doi.org/10.1175/1520-0442(1992)0052.0.CO;2, for example). Your explanation pretty much agrees with it, while your claim expresses disagreement.

      It seems to me you’re manufacturing a false objection.

  15. There has to be a media blitz about this. These alarmists are pouring it on thick right now. The claim that the turning point has arrived is true in one regard: they should be shown as the frauds they are, and it is they who have run out of time with their shams.

    This absolutely must become major news.

  16. Dr. Frank,

    This is very helpful. Bottom line, I take this to mean that even if one conceded that GCM tuning eliminated errors over the interval where we supposedly have accurate GAST data, the limited resolving power of the models means that projections of temperature impacts solely related to incremental CO2 forcing are essentially meaningless. Thank you.

    • You’ve pretty much got it, Frank. Tuning the model just gets the known observables right. It doesn’t get the underlying physical description right.

      That means there’s a large uncertainty even in the tuned simulation of the known observables, because getting them right tells us little or nothing about the physical processes that produced them.

      It doesn’t solve the underlying problem of physical ignorance.

      And then, when the simulation is projected into the future, there’s no telling how far away from correct the predictions are.

  17. Pat F., I like your style! Accumulated errors are the bane of scientists in whatever sector they reside. Yes, I have had occasions where I predicted a great gold ore intercept and, when the drill core was coming out, was forced to say “what the hell is this?”. What a complex topic the climate is in general, and then try to add anthropogenic forcing and feedback and it’s impossible.

  18. air temperature projections of advanced GCMs are just linear extrapolations of fractional greenhouse gas (GHG) forcing.

    This is obvious to anybody that has given thought to how GCMs have been programmed. GHGs are the only “known” factor affecting climate that can be projected. GCMs take the different GHG concentration pathways (RCPs) and project a temperature change. Although very complex, GCMs produce a trivial result.

    Current GCM air temperature projections have no physical meaning.

    Yet they appear to be dominating the social and political discourse. In the Paris 2015 agreement most nations on Earth agreed to commit towards limiting global warming partly by reducing GHG emissions. The main evidence that GHGs are responsible for the warming are GCMs.

    When most people believe [in] something, the fact that it is [not] real is small consolation to those that don’t believe. In this planet you can lose your head for refusing to believe [in] something that is not real. Literally.

  19. “We know from Lauer and Hamilton, 2013 that the annual average ±12.1% error in CMIP5 simulated cloud fraction (CF) produces an annual average ±4 W/m^2 error in long wave cloud forcing (LWCF).

    We also know that the annual average increase in CO₂ forcing is about 0.035 W/m^2.

    On an annual average basis, the uncertainty in CF feedback into LWCF is ±114 times larger than the perturbation to be resolved.

    The CF response is so poorly known that even the first simulation step enters terra incognita.”

    Exactly, the effect of man-made CO2 is so small, relative to the error in estimating low-level cloud cover, it can never be seen or estimated. I think everyone agrees with that. To me, Dr. Stokes and Dr. Spencer are clouding (pun fully intentional) the issue with abstruse, and mostly irrelevant arguments.
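
    For reference, the ±114 ratio quoted above is just the annual LWCF calibration uncertainty divided by the annual-average increase in GHG forcing (Python):

      print(round(4.0 / 0.035))  # ~114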

  20. “We also know that the annual average increase in CO₂ forcing is about 0.035 W/m^2.”

    The GCM operators have pulled this number from obscurity. This flux cannot be calculated from First Principles, so where did they get it? I assume it is based on the increase in CO2 since 1880, or some year, and the increase in Global Average Surface Temperature from 1880, or some year. These are hugely unscientific assumptions.

    • It is a good question from which source Pat Frank has taken this CO2 forcing value of 0.035 W/m^2. As we know very well, the IPCC has used since the TAR the CO2 forcing equation from Myhre et al., and it is simply
      RF = 5.35 * ln (CO2/280), where CO2 is the CO2 concentration in ppm.

      This equation gives the climate sensitivity forcing of 3.7 W/m^2 for a doubling, which is one of the best-known figures among climate change scientists. Gavin Schmidt calls it “a canonical” figure, meaning that its correctness is beyond question. A concentration of 400 ppm gives an RF value of 1.91 W/m^2. According to the original IPCC science, the CO2 concentration has increased to this present number from the year 1750. That means a time span of 265 years to 2015 and an average annual increase of 0.0072 W/m^2. So, it is a very good question from which source this figure of 0.035 W/m^2 comes.
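
      A minimal sketch of the arithmetic in the comment above (Python), using the quoted Myhre et al. expression; the numbers reproduce the 3.7 W/m^2 doubling figure, the 1.91 W/m^2 value at 400 ppm, and the ~0.0072 W/m^2 per year average when that is spread over 265 years:

        import math

        def rf_co2(c_ppm, c0_ppm=280.0):
            # Simplified Myhre et al. expression quoted above: RF = 5.35 * ln(C/C0)
            return 5.35 * math.log(c_ppm / c0_ppm)

        print(round(rf_co2(560.0), 2))        # doubling: ~3.71 W/m^2
        print(round(rf_co2(400.0), 2))        # 400 ppm: ~1.91 W/m^2
        print(round(rf_co2(400.0) / 265, 4))  # spread over 265 years: ~0.0072 W/m^2 per year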

      • I figured it out, maybe. If the annual CO2 increase is 2.6 ppm, then the annual RF value for CO2 is 0.035 W/m^2. It cannot be called an average value, but it is at the high end of the present CO2 annual growth variations. The long-term annual CO2 growth rate has been something like 2.2 ppm.

        • It’s the average change in GHG forcing since 1979, Antero.

          The source of that number is given in the abstract and on page 9 of the paper.

          • So I read Page 9 of your paper. Language a bit thick, but it did mention temperature records, sounded like from 1900 or thereabouts, or was it 1750?

            So, all the GCM’s ignore the concept of Natural Variation, ascribe all warming from some year to CO2 “Forcing?”

            This is not science, this is Chicken Little, “The Sky is Falling!!!”

            What if it starts to cool a bit? GCM’s pack it in and go home?

            This is ludicrous, and I do not mean Tesla’s Ludicrous Speed…

          • Michael, it’s just the sum of all the annual increases in GHG forcing since 1979, divided by the number of years.

            That gives the average annual increase in forcing since 1979.

          • “Michael, it’s just the sum of all the annual increases in GHG forcing since 1979, divided by the number of years.

            That gives the average annual increase in forcing since 1979.”

            This tells us nothing. How is a GHG “forcing” measured? I suggest that it cannot be.

    • It is fine to try to beat the GCM operators at their own game. Fun, a challenge, a debate.

      Do none of you know anything about radiation?

      “We also know the annual average increase in CO2 forcing is about 0.035 W/m2.”

      We know no such thing! Bring this number back, show me that it is not, instead, 0.00 W/m2!

      From where came this number? Not from First Principles. This is the crux of the matter, no one has established the physics of any CO2 forcing whatsoever, all from unscientific assumptions that All of the Warming from some year, 1750, 1880, 1979, was caused by CO2!!!

      Freaking kidding me, billions and billions of dollars on fake science?

      What are you people doing?????

      • I think there is first principle support for CO2 forcing (e.g.: http://climateknowledge.org/figures/Rood_Climate_Change_AOSS480_Documents/Ramanathan_Coakley_Radiative_Convection_RevGeophys_%201978.pdf).

        I think there are huge problems with the modeling (but I’m not sure the physics behind the hypothesized CO2 forcing is one of them):

        These include the uncertainty issue (across a multitude of inputs, not just LWCF); the fact that temperature anomalies may be modeled by a relatively simple linear relation to GHG forcing (implicating a modeling bias that mutes the actual complex non-linear nature of the climate to ensure temperature goes up with CO2); the lack of fundamental understanding of the character of equilibrium states of the climate across glacial and inter-glacial periods; and uncertainties associated with GHG-forcing estimates, among a myriad of other issues. But not the physics of GHG forcing.

        • “The absorption coefficient (K) is independent of wavelength”??? It most certainly is not, and they do not derive it, just pull it from obscurity.

          Please, someone, anyone, tell me where this number 0.035 W/m2 comes from. I suggest that there is no evidence showing that it is not, instead, 0.000 W/m2.

          • Is it so hard to search the paper, Michael?

            From the abstract: “This annual (+/-)4 Wm^2 simulation uncertainty is (+/-)114 x larger than the annual average ~0.035 Wm^2 change in tropospheric thermal energy flux produced by increasing GHG forcing since 1979.”

            Page 9: “the average annual ~0.035 Wm^-2 year^-1 increase in greenhouse gas forcing since 1979”

            The average since 1979. Is that really so hard?

            I posted some more recent numbers here Michael.

            It’s no mystery. It just takes checking the paper.

      • Pat Frank,

        Non-responsive. You are a physicist. Someone else must have written down this number and you have posted it here, not questioning its derivation.

        Once again, I suggest that it could just as easily be 0.000 W/m2.

        “The average since 1979. Is that really so hard?” Are you giving me a simple statement of fact? Facts according to whom, derived how?

        This is the basis of your entire contention, but it comes from just an assumption that all the warming since 1979, or some other year, is due to CO2.

        This is bizarre now, a simple question, prove to us that this number has been established from First Principles.

        I want to shoot down the entire basis of CAGW from CO2. Seems like you do too. There is no obvious physical basis. Have you considered this, or are you just debating the GCM’s?

        Wow…

        • Alright, so I followed your link, from the EPA?

          You are not a physicist at all. No physicist would just write down a number without being able to back it up, as I was taught in engineering school.

          Back it up.

          This is the entire basis of your paper, which, if true, could be a huge victory, help stop this gigantic fraud. If you cannot back it up, you got nothing, just debating statistics, another soft science.

          • You understand that what the value actually is doesn’t really matter, don’t you? Dr Frank took the value used in the climate models to develop his emulation and to show that the uncertainty overwhelms the ability of the models to predict anything. He wasn’t trying to validate the value; he was trying to calculate the uncertainty. When you are playing away from home, you use the home team’s ball, so to speak.

          • It’s an empirical number, Michael. Do you understand what that means?

            I didn’t “write it down.” It’s a per-year average of forcing increase, since 1979, taken from known values. Is that too hard for you to understand?

            Other people here have had trouble understanding per-year averages. And now you, too. Maybe it’s contagious. I hope not.

  21. Using a mathematical model of a physical process is a valid endeavor when the model matches reality to the degree of accuracy required for practical uses. A model can be created with any combination of inductive and deductive reasoning, but any model can only be validated by comparing its predictions against careful observations of the physical world. A model’s inputs and outputs are data taken and compared to reality. A model’s mathematics follow processes that can be observed through physical experimentation.

    A good example of such a model is used to explain the workings of a semiconductor transistor. Without such a proven-useful model all of our complex electronic devices would have been impossible to develop.

    GCM’s fail the basic premises of modeling. There can practically never be a complete input data set due to their gridded nature. Their architecture includes processes that have not been observed in nature. Their outputs do not match observed reality. They have no practical use for explaining the climate or predicting how it will change. They are simply devices to give a sciency feel to political propaganda.

    Explaining why they are failures is interesting to a point. But after decades of valiant attempts to explain their obvious shortcomings they still are in widespread use (and misuse). This fact only bolsters the argument that they are not for scientific uses but political tools.

    • They are doing tremendous political damage – a class already skittish about any physics will become anti-scientific. To a modern industrial civilization, that is Aztec poison.

    • An Engineer's Critique of Global Warming 'Science'
      Questioning the CAGW* theory
      http://rps3.com/Files/AGW/EngrCritique.AGW-Science.v4.3.pdf

      Using Computer Models to Predict Future Climate Changes
      Engineers and Scientists know that you cannot merely extrapolate data that are scattered due to chaotic effects. So, scientists propose a theory, model it to predict and then turn the dials to match the model to the historic data. They then use the model to predict the future.
      A big problem with the Scientist – he falls in love with the theory. If new data does not fit his prediction, he refuses to drop the theory, he just continues to tweak the dials. Instead, an Engineer looks for another theory, or refuses to predict – Hey, his decisions have consequences.
      The lesson here is one that applies to risk management

  22. This article is a Tour de force. It is a single broadside that reduces the enemy’s pretty ship, bristling with guns and fluttering sails, to a gutted, burning, sinking hull. Hopefully we see people abandoning ship very soon.

    • Matthew, it shows how much money and time can be spent to replace something that can be done with a linear extrapolation or a ruler and graph paper. GCM’s are models built by a U.N. committee of scientists and they look exactly like that.

        • Nick, we can all use computers to make pretty videos with R. The numbers used to frighten small children and lefty loonies are always the increase in average surface temperature, and computing that only takes a ruler and graph paper, as Dr. Frank and others have shown. The rest is smoke and mirrors and billions of dollars down academic drains.

        • Sure you can! You establish the alleged (and disproven) linear relationship of CO2 and temperature with your ruler (or hand held calculator), and then you build a Rube Goldberg machine to entertain the dimwitted.

          Voila!

        • Nick,

          Fair enough, sir. Is there any particular reference year for this NOAA data set such that it could be compared to one of their POES visualizations? Thank you.

          • No. As stated, GCMs don’t do weather, at least not years in advance. And that includes ENSO. They can do all the mechanics of ENSO, and they happen with about the right frequency. But they aren’t synchronised to Earth’s ENSO. We can’t predict ENSO on Earth, and there is no reason to expect that GCMs could.

  23. There seems to be no consistency with units here. Previous versions of the paper have asserted the rather bizarre proposition that the LWCF rmse of 4 W/m2 given by Lauer really has the units ±4 Wm⁻² year⁻¹ model⁻¹, because it is being averaged over years and models. And that was an essential part of the theory; carrying the year⁻¹ meant that it could be compounded over years. If it was month⁻¹ it would be compounded over months, with a much different result. The treatment in the paper was very inconsistent, but the insistence on the unit was spelt out:

    “On conversion of the above CMIP cloud root-mean-squared error (RMSE) as ±(cloud-cover unit) year⁻¹ model⁻¹ into a longwave cloud-forcing uncertainty statistic, the global LWCF calibration RMSE becomes ±4 Wm⁻² year⁻¹ model⁻¹. The CMIP5 models were reported to produce an annual average LWCF RMSE = ± 4 Wm⁻² year⁻¹ model⁻¹, relative to the observational cloud standard (Lauer and Hamilton, 2013).”
    The unit is even claimed to come from the source, Lauer and Hamilton, which is not true; they said it was 4 Wm⁻².

    But now it has all gone away again. ±4 W/m² everywhere, even in the heading.

    • Nick, The post contains:

      We know from Lauer and Hamilton, 2013 that the annual average ±12.1% error in CMIP5 simulated cloud fraction (CF) produces an annual average ±4 W/m^2 error in long wave cloud forcing (LWCF).

      How is this unclear?

      From Lauer and Hamilton, page 3833:

      These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means.

      The errors are annual means over twenty-year datasets and expressed as percentages. I don’t see the problem.

      • “I don’t see the problem.”
        Inconsistency with units is a very big problem in anything scientific. Just what are the units of this datum from Lauer and Hamilton? They seem to be whatever Pat wants them to be at any point in time. It isn’t just carelessness; he at times explains why they should be so, and then drops that. Mostly, despite what he says, he does just refer to ± 4 Wm⁻², as here.

        It is important for the arithmetic. As he says in introducing Eq 6:
        “The annual average CMIP5 LWCF calibration uncertainty, ± 4 Wm⁻² year⁻¹, has the appropriate dimension to condition a projected air temperature emulated in annual time-steps.”
        The converse of that is that ± 4 Wm⁻² has the inappropriate dimension. But the key is year⁻¹. That permits, or is intended to permit, annual compounding. That is a big part of the arithmetic. I objected that if you compounded monthly you’d get a much bigger number. Pat said, no, then (paraphrasing) the ± 4 Wm⁻² year⁻¹ would be ± 4/12 Wm⁻² month⁻¹. You can only do that with the extra time unit (and even then, because of adding in quadrature, it doesn’t work).
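
        For the record, the compounding arithmetic in dispute here works out as follows (Python; quadrature sums only, with no claim about which convention is correct):

          import math

          years = 100
          annual = 4.0 * math.sqrt(years)                 # +/-4 per year, compounded yearly
          monthly = (4.0 / 12.0) * math.sqrt(12 * years)  # +/-4/12 per month, compounded monthly
          print(round(annual, 1), round(monthly, 1))      # 40.0 vs ~11.5
          # The two root-sum-square totals differ by a factor of sqrt(12),
          # which is the substance of the objection above.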

        • Lauer and Hamilton used annual averages, which makes sense. They tried seasonal averages (page 3839), but settled on annual. I agree units are important, I just don’t see any problem with what Frank did. Your argument is valid, just doesn’t apply here as far as I can see.

          • It’s a very big problem. There is one datum point, and there seems to be no agreement on what its units are. And Pat claims he is the only one who knows about dimensional analysis. And that the units, whatever they are, are very important.

          • You overlooked the annual index “i” Nick.

            No one else has.

            Andy May is right. Lauer and Hamilton used annual averages. They’re referenced throughout the paper. Where per year is not stated, it is directly implied.

            You’re playing the same game here as you tried when claiming that rmse has only positive roots. You’re taking a convention and abusing its meaning.

          • “Where per year is not stated, it is directly implied.”
            Why on earth, when you are beating the drum about how only physical scientists understand the proper treatment of measurement, can’t you state units properly? No proper scientist “implies” units. They state them carefully and directly.

            So what is implied, then? What are the actual units of this 4 Wm⁻²? If you put ±4 Wm⁻² year⁻¹ into eq 5.2, you get u as K/year, and goodness knows what that could mean.

        • So much like the splicing of data from two completely different data sources? I seem to recall someone doing some sort of trick like that in a publication or two…

        • This is a non-issue.

          Average natural background radiation dose to humans in the U.S. is about 3.1 mSv/yr.

          I can also say this: The annual average radiation dose to humans in the U.S. is 3.1 mSv.

          Those statements are equivalent.

    • I replied to that complaint under your DE post, Nick.

      Look at eqns. 5.1 and 5.2. They are annual steps.

      Subscript “i” is a yearly index.

  24. A well written piece Dr Frank.

    It seems so difficult to argue against your logic, having codified what many engineers have thought and said over the years. But I suspect the usual candidates will have a go at you.

    Be brave, the logic is nearly complete.

  25. “That necessary linearity means that Nick Stokes’ entire huge variety of DEs would merely be a set of unnecessarily complex examples validating the linear emulation equation in my paper.”
    What is the basis of “unnecessarily”? You are using your very simple model, which produces a near linear dependence of global average surface temperature on forcing, to replace a GCM, which is certainly very complex, and saying that the simple model can be taken to emulate the error propagation of the GCM, even though it has none of the physics of conservation of mass, momentum and energy which actually guides and limits the propagation of error.

    As a scientific claim, you actually have the obligation to demonstrate that the error behaviour is the same, if you are going to analyse it in place of the GCM. Not just wave hands about how close the emulation of a single variable is.

    Forgotten as always in this is that GCMs are not just devices to predict global average surface temperature. They cover a huge number of variables, including atmospheric temperature at all levels. Matching just surface temperature over a period in no way establishes that the models are equivalent. This is obvious when Dr Spencer points out that this silly error growth claim would be limited by the requirements of TOA balance. Well, the Earth has such a balance, and so do the GCM’s, but there is nothing in Pat Frank’s toy model about it.

    The point of my proof that you can match a prescribed solution to any kind of error behaviour just reinforces the point that you have in no way established the requirements for analysing the toy in place of the real.

    • Nice try Nick. But, the point is that the toy does just as good a job estimating global mean surface temperature as the GCM’s. The fact that the GCM’s attempt (and fail) to produce a matrix of temperatures throughout the Troposphere is irrelevant; all anyone talks about is surface temperature, which is sad, I agree. Besides, John Christy has shown that the models are inept at estimating the vertical temperature gradient.

      Creating yet another red herring does not hide the problems with model validation, or lack thereof.

      • “But, the point is that the toy does just as good a job estimating global mean surface temperature as the GCM’s.”
        So would a curve fit, as the toy effectively is. But it tells you nothing about error propagation in the GCM.

        And you can’t isolate single variables – in the GCM they all interact. There are all sorts of effects in the GCM which would ensure that the temperature can’t just rise 18°C, as Pat’s toy model (with no physics) can. Dr Spencer’s TOA balance is just one.

        • And you can’t isolate single variables – in the GCM they all interact.

          Which is why all GCM’s are pure fantasy, and why they are not proof of anything, except bias.

        • Nick,
          I agree with you (and Spencer) to a point. Dr. Frank’s work does not invalidate the GCM’s, nor does it explain the propagated errors in the models. The climate data we have does that quite well.
          What his work does show is that what the models were designed to do, compute man’s influence on climate, cannot be accomplished with them, because they cannot resolve the low-level cloud cover accurately enough. The error due to changing cloud cover swamps what they are trying to measure and is unknown. This has been known for a long time. Spencer, Lindzen, and others have written about it before. I think that Frank’s work complements the others and is very helpful.
          I realize you (and perhaps Spencer as well) are trying to throw irrelevant stuff to mask his main conclusion, similar to others’ efforts to trivialize the work Spencer and Christy did with satellite temperature measurements or the work that Lindzen did on tropical cloud cover ECS estimates, but it won’t work. The underlying problem with the climate models is they are not accurate enough to estimate man’s contribution to climate change, and they may never be.

        • “And you can’t isolate single variables – in the GCM they all interact. There are all sorts of effects in the GCM which would ensure that the temperature can’t just rise 18°C, as Pat’s toy model (with no physics) can. Dr Spencer’s TOA balance is just one”

          Are you saying that if you don’t understand how one variable works, you should add many many more variables that you also don’t understand and…Magic ?

          It sounds to me like you’re admitting the GCMs have an a priori conclusion (reasonable-looking predictions that show an impact of CO2 forcing). Of course you can curve fit enough variables to get what you want. Does it model the real world, though?

        • Nick Stokes: And you can’t isolate single variables – in the GCM they all interact. There are all sorts of effects in the GCM which would ensure that the temperature can’t just rise 18°C, as Pat’s toy model (with no physics) can. Dr Spencer’s TOA balance is just one.

          Actually, Pat Frank’s analysis shows that you can isolate a single variable. Your point about there being many variables whose uncertainties ought to be estimated concurrently implies that Pat Frank has achieved an approximate lower bound on the estimation uncertainty.

      The sum and substance of your commentaries is just that: the actual model uncertainty resulting from uncertainties in the parameter values is greater than his estimate.

          • “Actually, Pat Frank’s analysis shows that you can isolate a single variable.”
            Well, it shows that he did it. But not that it makes any sense. Dr Spencer’s point was, in a way, that there is a Le Chatelier principle at work. If something changes, something else varies to counter the change. The reason is the overall effect of the conservation principles at work. Roy cited TOA balance as one.

            But Pat Frank’s toy does not have any other variables that could change, or any conservation principles that would require them to.

          • Nick,

            “If something changes, something else varies to counter the change. The reason is the overall effect of the conservation principles at work. Roy cited TOA balance as one.

            But Pat Frank’s toy does not have any other variables that could change, or any conservation principles that would require them to.”

            Why would Pat Frank’s emulation *need* any other variables if his output matches the output of the models? This just sounds like jealousy rearing its ugly head.

            Conservation principles do not cancel out uncertainty. Trying to say that it does is really nothing more than an excuse being used to justify a position.

          • Roy Spencer’s analysis confuses a calibration error statistic with an energy flux.

            His argument has no critical impact, or import for that matter.

            Your comment, Nick, shows you don’t understand that really obvious distinction, either.

            Either that, or you’re just opportunistically exploiting Roy’s mistake for polemical advantage.

          • A question for Pat Frank: Would it be incorrect to think of an uncertainty value as a type of metadata, attached to and describing the result? The result addresses the question posed, while the uncertainty value speaks to the (quality of the) result.

          • Matthew Schilling
            Since Pat hasn’t responded, I’ll presume to weigh in. I think that metadata is an apt description for uncertainty.

      • Nick, “There are all sorts of effects in the GCM which would ensure that the temperature can’t just rise 18°C, as Pat’s toy model (with no physics) can. (my bold)”

        Here we go again. Now even Nick Stokes thinks that an uncertainty in temperature is a physical temperature.

        That’s right up there with thinking a calibration statistic is an energy flux.

        You qualify to be a climate modeler, Nick. Your level of incompetence has raised you up into that select group.

        • Pat, for what it’s worth, I as a physicist am shocked by the sheer incompetence that seems to be present in the climate community. The method you are using is absolutely standard; every physics undergraduate is supposed to understand it, and usually does without any effort. The mistaken beliefs about error propagation that the climate guys show in all their comments are downright ridiculous. Kudos to you for your patience explaining again and again the difference between error and uncertainty. Really hard concepts to grasp for certain people.
          (I’m usually much more modest when commenting, but when people trumpet BS with such a conviction, then I can’t hold myself back)

          • Thank-you so much for the breath of fresh air, nick.

            It means a huge lot to me to get support in public from a bona fide physicist.

            As you can imagine, climate modelers are up in arms.

            Even climate scientists skeptical of AGW are going far afield. You may have seen Roy Spencer’s critique, equating an uncertainty statistic with an energy flux, here and here, as well as on his own site.

            Others have equated an uncertainty in temperature with a physical temperature.

            If you wouldn’t mind, it would be of huge help if you might support my analysis elsewhere to others as the occasion permits.

            Thanks again for stepping out. 🙂

    • Nick,

      Is TOA balance a constraint on the GCMs? Wouldn’t there be any number of non-unique solutions to the models if it wasn’t?

        • Thank you Nick. It sounded like a constraint to me. For this reason, I was puzzled by Dr. Spencer’s initial objection to Dr. Frank’s paper on the basis that GCMs achieve TOA balance. PS – Given your modeling expertise with DEs, did you ever do any work in quantitative finance?

    • Soden and Held’s postulated water vapor feedback mechanism is central to the theory that additional warming at the earth’s surface caused by adding CO2 to the atmosphere can be amplified, over time, from the +1 to +1.5C direct effect of CO2 into as much as +6C of total warming. (Citing Steven Mosher’s opinion that +6C is credible as an upper bound for feedback driven amplified warming.)

      However, it is impossible at the current state of science to directly observe this postulated mechanism operating in real time inside of the earth’s climate system. The mechanism’s presence must be inferred from other kinds of observations. One important source of the ‘observations’ used to characterize and quantify the Soden-Held feedback mechanism is the output generated from the IPCC’s climate models, a.k.a. the GCM’s.

      See this comment from Nick Stokes above:

      https://wattsupwiththat.com/2019/09/19/emulation-4-w-m-long-wave-cloud-forcing-error-and-meaning/#comment-2799230

      In the above comment, Nick Stokes says, “The point of my proof that you can match a prescribed solution to any kind of error behaviour just reinforces the point that you have in no way established the requirements for analysing the toy in place of the real.”

      In his comment, Nick Stokes labels Pat Frank’s GCM output emulation equation as ‘the toy’ and the GCM’s the equation emulates as ‘the real.’

      Referring to Soden and Held’s use of output from the GCM’s as observational data which supports their theory, it is perfectly appropriate to extend Nick Stokes’ line of argument by labeling the GCM’s as ‘the toys’ and the earth’s climate system itself as ‘the real.’

      With this as background, I make this request of Nick Stokes and Roy Spencer:

      Please post a list of requirements for producing and analyzing the outputs of GCM’s being used as substitutes for observations made directly within the earth’s real climate system. In addition, please include a glossary of scientific and technical terms which defines and clarifies the exact meaning and application of those terms, as these are being employed in your list of requirements.

      Thanks in advance.

      • Right at the beginning of this debate I reminded the protagonists, particularly Nick, to be rigorous about the various constructs they were discussing. As you say, there is the real world, the current set of GCMs, and the linear emulator of that set of models. Add to that some future set(s) of potentially improved GCMs and their potential emulators.

        The questions being discussed relate to the way each perform on temp projection in and out of sample, and conclusions can only be drawn within the limitations of that framework.

        I’ve decided in the end that rigour is to be avoided in favour of the rhetoric that the lack of it allows.

      • “Please post a list of requirements for producing and analyzing the outputs of GCM’s being used as substitutes for observations made directly within the earth’s real climate system. “
        They aren’t a substitute for observing future states. We just don’t know how to do that yet.

        But in fact they are used to enhance observation of the present. This is the reanalysis of data from numerical weather forecasting, which is really just re-running the processes of the forecasts themselves. It does build on the knowledge of the earth that we acquire from observation. And it is using programs from the GCM family.

        • Within the context of your ongoing debate with Pat Frank, your response indicates you have no intention of addressing what is a perfectly reasonable request.

    • Stokes,
      You said, “They cover a huge number of variables, including atmospheric temperature at all levels.” And they also cover precipitation. They are notorious for doing a poor job of predicting precipitation at the regional level, with different models getting opposite results. This is further evidence that the models are unfit for purpose. So what if they include “atmospheric temperature at all levels?” They may be “reasonable” in the sense that they are physically possible, but are they “reliable?”

    • Nick Stokes –> You’re missing the trees for the forest. If I drove 100 miles and used 10 gallons of gas I could calculate my average miles/gallon a couple of ways. I could assume a simple (KISS) linear model and simply divide 100 by 10. Or, I could go off and develop all kinds of equations that simulate aerodynamics, tire friction, ICE performance (as you mentioned), etc., and end up writing Global Mile per Gallon Model (GMGM). Which one do you think would give me the better answer with the least uncertainty?

    • Nick Stokes: You are using your very simple model which produces near linear dependence of global average surface temperature,

      Pat Frank does not have a simple model of global average surface temperatures, he has a simple model of GCM-forecast global average surface temperatures. He also does not have a model of any other GCM forecast, only global mean temperature. He does not claim to show that GCM forecasts are unreliable in all things, such as global annual rainfall, only that they are unreliable on their most cited forecasts, global mean temperature. If they are truly reliable for their other forecasts, that will be remarkable. He has provided a template for how the unreliability of other forecasts might be estimated: start by regressing rainfall forecasts against forcing inputs, and go from there; if it’s another monotonic function with a tight fit, we’re golden.

      In the mean time, know that the GCM forecasts of mean global temp are unreliable.

    • Irrelevant, Nick. My analysis is about air temperature projections, and nothing else.

      GCMs are just linear extrapolation machines. The uncertainty analysis follows directly from that.

      The emulation equation shows the same sensitivity to GHG forcing as any GCM. Embarrassing, isn’t it.

      All the rest of your caviling about the internal complexity of GCMs is just so much raising of the dust. Their output is simple. And that’s the rub.
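
      Purely as an illustration of the black-box claim above, a generic linear emulator of the form dT = a + b*dF can be fitted to a GCM’s projected anomalies; the arrays below are hypothetical, and this is not Pat Frank’s published emulation equation, only the idea that projected temperature tracks forcing linearly.

        import numpy as np

        forcing = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])     # W/m^2, hypothetical cumulative GHG forcing
        gcm_dT  = np.array([0.4, 0.8, 1.15, 1.55, 1.95, 2.3])  # K, hypothetical GCM projection
        b, a = np.polyfit(forcing, gcm_dT, 1)                  # slope and intercept
        residuals = (a + b * forcing) - gcm_dT                 # how closely the straight line tracks the "GCM"
        print(round(b, 2), round(a, 2), np.round(residuals, 2))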

      • “The emulation equation shows the same sensitivity to GHG forcing as any GCM.”

        This is a very important point and one that I would expect GCM modelers to want to dig into. This result certainly opens the possibility that the GCMs are all subject to significant modeler’s bias that should be analyzed and run to ground.

        It is almost unbelievable that these complex models would yield a linear relationship between GHG forcing and temperature. Clearly the climate does not behave that way, indicating that the GCMs are not representing reality.

        That so many cannot see this point is astounding to me.

        Great job Dr. Frank

  26. When CERN ran the CLOUD test with their particle beam and aerosols, they actually modelled the formation of Cloud Condensation Nuclei and got it wrong! Svensmark pointed that out in early 2018.
    Since the GCM’s cannot handle hurricanes (the joker card being that they lack resolution), they have no hope of handling Forbush decreases and CMEs.
    So the reason in this case is not resolution, just lack of physics.

    It is very refreshing to see resolution, uncertainty, error all clearly expressed.

    Just a side note – Boeing engineering was forced to change the engines because of CO2, and used (outsourced) software to compensate, which failed. Someone decided physics and engineering could be sidelined. I just wonder if that software was ever run through such an uncertainty and error analysis?

  27. Dr. Frank,
    An excellent ‘plain English’ explanation of your published paper and a succinct rebuttal of Nick Stokes’ differential-equations dissembling. You logically constrained Stokes to a black box ‘time out’.
    Thank You!

  28. There is no sensitivity to CO2. If there were, then the specific heat table would say you must include a forcing equation when dealing with air or CO2 where infrared is involved. But it doesn’t.

  29. Thank you, Dr. Frank. Your analyses are logical, and further our understanding of the credibility of “Models”, which have become the underpinnings of planned political movements. If we can only get the policy makers, the general public, and especially the youth to understand that the huge investments being contemplated are merely building castles in the sand. I’ve seen too many of these half-baked good intentions in my lifetime. (“The Great Society”, Vietnam, Iraq, etc.) Again, thank you for the breath of fresh air you are providing!

    • Happy it worked out, MDBill.

      One really beneficial outcome is to remove the despair that is being pounded into young people.

      There’s every reason to think the future will be better, not doom-laden. That word should get out.

        • I think Nick still believes that uncertainty is the same as a random error, and that, since random errors tend to cancel according to the central limit theorem, uncertainty does the same. Uncertainty, however, is not random!

      • Nick Stokes: That’s the problem with random walk.

        You have not, as far as I have read, explained why you think Pat Frank’s procedure is a random walk. Perhaps because the uncertainty is represented as the standard deviation of a probability distribution you think the parameter has a different randomly sampled value each year. That would produce a random walk. That is not what he did.

        • “explained why you think Pat Frank’s procedure is a random walk”
          Well, here is what he says on p 10:
          “The final change in projected air temperature is just a linear sum of the linear projections of intermediate temperature changes. Following from equation 4, the uncertainty “u” in a sum is just the root-sum-square of the uncertainties in the variables summed together, i.e., for c = a + b + d + … + z, then the uncertainty in c is ±u_c =√(u²_a+u²_b+…+u²_z) (Bevington and Robinson, 2003). The linearity that completely describes air temperature projections justifies the linear propagation of error. Thus, the uncertainty in a final projected air temperature is the root-sum-square of the uncertainties in the summed intermediate air temperatures.”

          Or look at Eq 6. The final uncertainty is the sqrt sum of variances (which are all the same). The expected value of the sum. How is that not a random walk?

          And note that despite the generality of Eq 3 and 4, he is assuming independence here, though doesn’t say so. No correlation matrix is used.
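
          A neutral numerical check of the point in dispute (Python): the root-sum-square rule gives the same n-step spread as the standard deviation of a random walk with independent ±u steps. Whether that is the right statistical model for a calibration uncertainty is exactly what is being argued here.

            import math
            import random

            u, n, trials = 4.0, 100, 20000
            rss = u * math.sqrt(n)                        # root-sum-square of n equal terms
            walks = [sum(random.choice((-u, u)) for _ in range(n)) for _ in range(trials)]
            sd = math.sqrt(sum(w * w for w in walks) / trials)
            print(round(rss, 1), round(sd, 1))            # ~40.0 and ~40.0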

          • Nick Stokes: And note that despite the generality of Eq 3 and 4, he is assuming independence here, though doesn’t say so. No correlation matrix is used.

            That part was explained already: the correlation is used in computing the covariance.

          • “Uncertainty, however, is not random!”
            “It’s about uncertainty.”

            Uncertainty has a variance (Eq 3). And its variance compounds by addition through n steps (eq 4). That is exactly how a random walk works. Just below Eq 4:

            “Thus, the uncertainty in a final projected air temperature is the root-sum-square of the uncertainties in the summed intermediate air temperatures.”

            That is exactly a random walk.

          • Nick,

            “Thus, the uncertainty in a final projected air temperature is the root-sum-square of the uncertainties in the summed intermediate air temperatures.”

            That is exactly a random walk.”

            No, it is not a random walk. The uncertainties are not random in nature, therefore their sum cannot be a random walk.

          • “the correlation is used in computing the covariance.”
            Where? Where did the data come from? What numbers were used?

            As far as I can see, the arithmetic of Eqs 5 and 6 is fully spelt out, with a hazy fog of units. The numbers are given. None relates to correlation. No term for correlation or covariance appears.

  30. “In the paper, GCMs are treated as a black box.”
    As they should be, because programs of that size are not easily examined and understood by those who are not paid to do so and cannot expend the necessary time to step through the Fortran code. There is an old saying that “All non-trivial computer programs have bugs.” Parallel processing programs of the size of GCMs, rife with numerically-approximated partial differential equations, certainly qualify as being “non-trivial.”

    • “As they should be because programs of that size are not easily examined and understood”
      So you write a paper about how you don’t understand GCM’s, so you’ll analyse something else?

  31. Pat, your work and the responses in critical articles and thoughtful comments here have been the best example of science at work at WUWT in a long time. Today’s response from you has clarified a complex issue. Many thanks. You have also inspired an (old) idea and a way forward in development of a more robust theory in your comments below:

    “Does cloud cover increase with CO₂ forcing? Does it decrease? Do cloud types change? Do they remain the same?

    What happens to tropical thunderstorms? Do they become more intense, less intense, or what? Does precipitation increase, or decrease?”

    I think the answer to these questions is calling out loud and clear. Our fixation on satellite and computer tech has blinded us to the importance of old-fashioned, detailed fieldwork for getting at the answers. I am a geologist who has sweated out mapping geology on foot, canoe, Landrover, helicopter, etc. on geological survey and mining exploration work in Canada, Africa, the US and Europe.

    We know the delta CO2 well enough. We need to make millions of observations in the field, along with help from our tech, and record local (high resolution) changes in temperatures, pressures, humidity, wind speeds and directions, and details on the development and physiology of thunderstorms. A new generation of buoys that can see the sky and record all this would also be useful.

    Doubting that such a task could be accomplished? Here is a Geological Map of Canada that is a compilation of millions of observations, records and interpretations (a modest number of pixels of this is my work, plus ~35,000 km^2 of Nigeria, etc.)

    https://geoscan.nrcan.gc.ca/starweb/geoscan/servlet.starweb?path=geoscan/fulle.web&search1=R=208175

    Scroll down a page, tap the thumbnail image and expand with your fingers.

    • Your comment, Gary, that, “I think the answer to these questions is calling out loud and clear. Our fixation on satellite and computer tech has blinded us to the importance of old-fashioned, detailed fieldwork for getting at the answers. I am a geologist who has sweated out mapping geology on foot, canoe, Landrover, helicopter, etc. on geological survey and mining exploration work in Canada, Africa, the US and Europe.” …

      expresses something I’ve also thought for a long time.

      Climate modelers have abandoned the reductionist approach to science. They try to leap to a general theory, without having done all the gritty detail work of finding out how all the parts work.

      Their enterprise is doomed to failure, exactly for that reason.

      It won’t matter how well they parse their differential equations, how finely they grid their models, or how many and powerful are their parallel processors. They have skipped all the hard-scrabble work of finding out how the parts of the climate system operate and how they couple.

      Each bit of that is the work of a lifetime and brings little glory. It certainly disallows grand schemes and pronouncements. Perhaps that explains their avoidance.

  32. I attempt a first-order analogy of the earth’s temperature with the water-level of a hypothetical lake. This causes me to question both the GCMs and Dr. Frank’s method of estimating their error bounds:

    Suppose it is observed that the water level of some lake varies up and down slightly from year to year, but over numerous years has a long-term trend of rising. We want to determine the cause. Assume there are both natural and human contributors to the water entering the lake. The natural “forcing” of the lake’s level consists of streams that carry rainwater to the lake, and the human forcing is from nearby houses that empty some of their waste-water into the lake. Some claim that it is the waste-water from the houses that is causing most of the long-term rise in the lake. This hypothesis is based on a model that is thought to accurately estimate the increasing amount of water contributed yearly by the houses, as more developments are built in the vicinity. However, the measurement of the other contributor, the water that flows naturally into the lake, is not very good; the uncertainty in that water flow is 100 times greater than the modeled amount of water from the houses. Presumably, in such a case, one could not conclude with any confidence that it is the human ‘forcing’ that is causing the bulk of the rise in the lake.

    Similarly, given the uncertainty in the contribution of natural forcings like clouds on earth’s temperature, the GCMs give us little or no confidence that the source of the warming is mainly human CO2 forcing.

    We could remove the effects of clouds in the GCMs if we knew that their influence on world temperature was constant from one year to the next, just as in the analogy we could remove the effects of natural sources of water on the level of the lake if we knew that the streams contribute the same amount each year. But, presumably, we don’t have good knowledge of the variability of cloud forcings from one year to the next, and I think this is the problem with Dr. Frank’s error calculation. To calculate the error in the GCM predictions, what is needed is the error in the variability of the cloud effects from one year to the next, not the error in their absolute measurement. Perhaps this is what Dr. Spencer was getting at in his critique of Dr. Frank’s method. To analogize once again, if my height can only be measured to the nearest meter as I grow, there is an uncertainty of one meter in my actual, absolute height at the time of measurement. But this provides no reasonable basis for treating the error in my predicted height as cumulatively increasing by many meters as years go by.

    • Good God, David L,

      That analogy so clouds my understanding.

      I have never understood the approach of creating a mind-boggling analogy to help clarify an already mind-boggling argument. It’s as if you substitute one complexity for another and ask us to dissect the flaws or attributes of an entirely separate thing, in addition to trying to understand what is already hard enough to understand.

      GENERAL REQUEST: Stop with the convoluted analogies that only confuse the issue more.

  33. I figured it out, maybe. If the annual CO2 increase is 2.6 ppm, then the annual RF value for CO2 is 0.035 W/m^2. It cannot be called an average value, but it is at the high end of the present CO2 annual growth variations. The long-term annual CO2 growth rate has been something like 2.2 ppm.

    • Why is it such a big mystery, Antero? I described the method as the average since 1979.

      The forcings I used were 1979-2013, calculated using the equations of Myhre, 1998:

      In 1979, excess CO2 forcing was 0.653 W/m^2; CO2 + N2O + CH4 forcing was 1.133 W/m^2.

      In 2013 they were 1.60 and 2.44 W/m^2.

      CO2 = (1.60-0.653)/34 = 0.028 W/m^2.

      Major GHG = (2.44 – 1.13)/34 = 0.038 W/m^2

      The numbers to 2015 at the EPA page, give 0.025 W/m^2 and 0.030 W/m^2 respectively.
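
      For anyone who wants to check the arithmetic, here is a minimal sketch (in Python) that just reproduces the averaging of the forcing values quoted above; the numbers are the ones given in this comment:

          # Average annual forcing increment over 1979-2013 (34 years),
          # using the forcing values quoted above (W/m^2).
          f_co2_1979, f_co2_2013 = 0.653, 1.60   # excess CO2 forcing
          f_ghg_1979, f_ghg_2013 = 1.133, 2.44   # CO2 + N2O + CH4 forcing
          years = 2013 - 1979                    # 34

          print((f_co2_2013 - f_co2_1979) / years)   # ~0.028 W/m^2 per year
          print((f_ghg_2013 - f_ghg_1979) / years)   # ~0.038 W/m^2 per year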

  34. It’s not coincidental that much of the most persuasive criticism of the climate scam has come from professionals who deal in engineering and economic analyses, such as McIntyre and McKitrick, or from scientists like Dr. Frank who seek to use experimental data to prove or disprove theoretical calculations. There’s nothing like reality, whether measured in dollars or in the failure of devices, to focus one’s mind.

    Consider the manufacture of any large structure, say an airplane or a ship. Mass production methods require that the components of the final product be assembled in an efficient process. The tolerances of each part must be sufficiently tight that when a large number of them are put together, the resultant subassembly can still satisfy similarly tight tolerances, so that the final assembly stays within tolerance. Boeing’s attempt to farm out subassemblies of the 787 was only partially successful because manufacturing practices in some countries simply weren’t at the level needed. This traces back to WWII and aircraft production facilities like that at Willow Run, where Ford produced B-24s at a rate of about one aircraft per hour. B-24s were assembled at Willow Run using about 20,000 man-hours, whereas the previous methods used by Consolidated in San Diego took about 200,000 man-hours. Much of those 200,000 hours were spent by craftsmen working to get all the disparate, relatively low-tolerance pieces to fit together.

    Construction of huge tankers and bulk carriers faces the same problem as very large subassemblies are brought together. Failure to control the tolerances of the parts and pieces means that the final assembly cannot be completed without costly reworking of the parts. So reality lends a hand in focusing the engineering effort. Ten percent uncertainties in the widths of pieces that were to be assembled into the engine room of a tanker would be highly visible and painfully obvious, even to a climate modeler.

  35. Dr Frank,

    If we have some idea of the probability distribution of the cloud forcing uncertainty, can we get a probability distribution for the temperature at the end of 100 years that the model gives? Can another formula, instead of the square root of the sum of squared errors, be used if we know more about the error distribution at each step?

    • Stevek, how does one know the error probability distribution of a simulation of a future state? There are no observables.

      • Thank you! That makes sense to me now and clears up my thinking. The uncertainty itself at the end of 100 years must have a distribution that can be calculated if we know the distribution of all variables that go into the initial state? You are not saying all points within the ignorance are equally likely?

        • Stevek, I’m saying that no one knows where the point should be within the uncertainty envelope.

          To be completely clear: suppose the uncertainty bound is (+/-)20 C. Suppose, also, that the physical bounds of the system require that the solution be somewhere within (+/-)5 C.

          Then the huge uncertainty means that the model cannot say where the correct value should be, within that (+/-)5 C.

          The uncertainty is larger than the physical bounds. This means the prediction, whatever value it takes, has no physical meaning.

  36. “I want you to unite behind the science. And then I want you to take real action.”

    Swedish climate activist Greta Thunberg appeared before Congress to urge lawmakers to “listen to the scientists” and embrace global efforts to reduce carbon emissions. https://twitter.com/ABC/status/1174417222892232705

    Dr. Frank, have you received your invitation to present actual Science to the Commi….huh?….no “contrary views allowed”….”only CONSENSUS ‘Science’ is acceptable?”…..oh, well, sorry to have bothered you.

  37. I quote from chapter 20: Basic equations of general circulation models from “Atmospheric Circulation Dynamics and General Circulation Models” by Masaki Satoh”

    One of the most uncertain factors in the reliability of currently used general circulation models is the use of cumulus parameterization. Since the horizontal extent of cumulus convection is about 1 km, the effects of cumulus convection must be statistically treated in general circulation models with horizontal resolutions of about 100 km. However, it is very difficult to appropriately parameterize all the statistical effects of cumulus convection, though many kinds of cumulus parameterizations are being used in current models. As the horizontal resolution of numerical models approaches 1 km, individual clouds can be directly resolved in the models, so that it is expected that we will no longer need to use such cumulus parameterization based on statistical hypothesis. Thus, the likely horizontal resolution of next generation general circulation models is a few kilometers. We expect the use of models with 10-km resolution or less will come within the range of our computer facilities. With such finer resolution models, the assumption of hydrostatic balance is no longer acceptable. We must switch governing equation of the general circulation models from hydrostatic primitive equations to non-hydrostatic equations. As for vertical resolution, we do not have a suitable measure of its appropriateness.

    And Dr. Frank makes the following comment:
    >>
    The CMIP5 GCM annual average 12.1% error in simulated CF is the resolution lower limit. This lower limit is 121 times larger than the 0.1% resolution limit needed to model the cloud feedback due to the annual 0.035 W/m^2 of CO₂ forcing.
    <<

    So let’s talk about the grid resolutions of CMIP5 GCMs. Here is a link to a list of resolutions: https://portal.enes.org/data/enes-model-data/cmip5/resolution.

    The finest resolution is down to about 0.1875 degrees (which is probably questionable). Most of the resolutions are around 1 degree or more. A degree on a great circle is 60 nautical miles, which is about 111 km. Even the 0.1875-degree resolution is more than 20 km. Obviously they are using parameterization to deal with cumulus convection. In other words, cumulus convection is one of the more important physical processes in the atmosphere, and they are making it up.
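
    A quick back-of-the-envelope check of those grid spacings, assuming roughly 111 km per degree of great-circle arc (a sketch; the resolutions are the ones listed at the link above):

        # Convert angular grid resolution to an approximate grid spacing in km.
        KM_PER_DEGREE = 111.0   # ~60 nautical miles per degree on a great circle
        for res_deg in (1.0, 0.1875):
            print(f"{res_deg} deg -> ~{res_deg * KM_PER_DEGREE:.0f} km")
        # ~111 km and ~21 km -- both far coarser than the ~1 km scale of cumulus convection.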

    Jim

  38. Pat,

    Stokes has previously stated, “… yes, DEs will generally have regions where they expand error, but also regions of contraction.” As I read this, it isn’t obvious or easily determined just where the expansions or contractions occur, or how to characterize them other than by running a large number of ensembles to estimate the gross impact.

    I think that an important contribution you have made is the insight of being able to emulate the more complex formulations of GCMs with a linear model. You are then able to demonstrate in a straightforward way, and certainly more economically than running a large number of ensembles, the behavior of uncertainty in the emulation. It would seem reasonable to me that if the emulation does a good job of reproducing the output of GCMs, then it should also be properly emulating the uncertainty.

  39. How about getting these same “models” to explain deep “Ice Ages” while CO2 was much higher than today? If they can’t do that, then they are worthless for predicting the future….

  40. Stokes… “… yes, DEs will generally have regions where they expand error, but also regions of contraction.” ?
    How could any intelligent person assume that the positive and negative errors would “cancel each other out”?

    • “cancel each other out”
      Who said that? Firstly, it isn’t positive or negative, but expanding and contracting. But more importantly I’m saying that there is a whole story out there that you just can’t leave out of error propagation. It’s what DEs do.

      In fact, I think the story is much more complicated and interesting than Pat Frank has. The majority of error components diminish because of diffusion (viscosity). Nothing of that in Pat’s model. But some grow. The end result is chaos, as is well recognised, and is true in all fluid flow. But it is a limited, manageable problem. We live in a world of chaos (molecules etc) and we manage quite well.

      • Who said errors cancel? Dr Spencer wrote this:

        “The reason is that the +/-4 W/m2 bias error in LWCF assumed by Dr. Frank is almost exactly cancelled by other biases in the climate models that make up the top-of-atmosphere global radiative balance”

        • “Dr Spencer wrote this”
          Well, it wasn’t me. But it is a different context. He is saying, not that the biases are assumed to balance at TOA, but that they are required to balance. This is an application of conservation of energy in the model, and would prevent the sort of accumulation of error that Pat is claiming. Not that it even arises; he seems to have abandoned the claim that the units of the RMSE involved are 4 Wm⁻² year⁻¹.

          • Stokes
            You really don’t understand! Only the nominal (calculated) values output at each time step can be tested for “TOA balance” or any test of reasonableness. Unless the calculations are performed in tandem with the maximum and minimum probable values, the “accumulation error” (as you call it) isn’t going to show up. That is, the way it is done, with a single value being output, the uncertainties have to be calculated separately.

            Pat is NOT claiming that the nominal value drifts with time, but rather, that the uncertainty envelope around the calculated nominal value rises more rapidly than the predicted temperature increase.

          • Clyde,
            “Unless the calculations are performed in tandem with the maximum and minimum probable values, the “accumulation error” (as you call it) isn’t going to show up. “
            I commented earlier about the tribal gullibility of sceptics, which you seem to exhibit handsomely. I noted, for example, the falling in line behind the bizarre proposition that the 4 Wm⁻² added a year⁻¹ to the units because it was averaged over a year (if it was). Folks nodded sagely, of course it must be so. Now the year⁻¹ has faded away. So, I suppose, they nod, yes was it ever different? Certainly no-one seems interested in these curious unit changes.

            And so it is here. Pat creates some weird notion of an uncertainty that goes on growing, and can’t be tested, because it would be wrong to expect to see errors in that range. “You really don’t understand!”, they say. ” the “accumulation error” (as you call it) isn’t going to show up”.

            So what is this uncertainty that is never going to show up? How can we ever be affected by it? Doesn’t sound very scientific.

          • I know you didn’t say it (though you referenced and supported Dr. Spencer’s overall take elsewhere). And, it seems like you don’t disagree with the statement and think it’s relevant to Dr. Frank’s error accumulation argument (correct?).

            Honestly, it seems to me folks are talking past each other.

            The issue isn’t that “errors” accumulate in the sense that the variance of expected model outcomes would increase. They’re engineered not to.

            The “error” of interest, and the one that does accumulate, is our confidence (lack of) the model is accurately modeling the Real World.

            Do you disagree that the value proposition of the models is that they are predictive and that they are predictive because they (purportedly) simulate reality?

          • Tommy,
            “that they are predictive because they (purportedly) simulate reality”
            They simulate the part of reality that they claim to simulate, namely climate. It is well acknowledged that they don’t predict weather, up to and including ENSO. That is another thing missing from Pat Frank’s analysis. He includes all uncertainty about weather in his inflated totals.

            That comes back to my point about chaos. It means you can’t predict certain fine scale features of the solution. But in that, it is usually reflecting reality, where those are generally unknown, because they don’t affect anything we care about. For example, CFD won’t predict the timing of shedding of vortices from a wing. It does a reasonable job of getting the frequency right, which might be important for its interaction with structural vibrations. And it does a good job of calculating the average shed kinetic energy in the vortices, which shows up in the drag.

            Those are the things that you might want to do an uncertainty analysis on. No use lumping in the uncertainty of things you never wanted to know about.

          • Nick, I appreciate your thoughtful reply, but I’m confused by this statement:

            “Those are the things that you might want to do an uncertainty analysis on. No use lumping in the uncertainty of things you never wanted to know about.”

            Isn’t the parameter Dr. Frank is isolating an input to the model at each iteration? Isn’t an input necessarily something we want to know about?

            And, given that:

            “That comes back to my point about chaos. It means you can’t predict certain fine scale features of the solution”

            But, if you can’t predict the small things that are iterative inputs to your model, how can you hope to predict the larger things (climate) that depend on them?

            It seems to me that in order to remove the accumulation of uncertainty, you either have to remove Dr. Frank’s parameter of focus from the model (with justification) or improve the modeling accuracy of it. You can’t whitewash the fact that you can’t model that small bit of the puzzle by claiming you get the bigger picture correct, when the bigger picture is a composite of the littler things.

          • Nick, “This is an application of conservation of energy in the model, and would prevent the sort of accumulation of error that Pat is claiming.

            I claim no accumulation of error, Nick. I claim growth of uncertainty. You’re continually making this mistake, which is fatal to your case.

            This may help:

            Kline SJ. The Purposes of Uncertainty Analysis. Journal of Fluids Engineering. 1985;107(2):153-60. https://doi.org/10.1115/1.3242449

            The Concept of Uncertainty

            Since no measurement is perfectly accurate, means for describing inaccuracies are needed. It is now generally agreed that the appropriate concept for expressing inaccuracies is an “uncertainty” and that the value should be provided by an “uncertainty analysis.”

            An uncertainty is not the same as an error. An error in measurement is the difference between the true value and the recorded value; an error is a fixed number and cannot be a statistical variable. An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable.

            The term “calibration experiment” is used in this paper to denote an experiment which: (i) calibrates an instrument or a thermophysical property against established standards; (ii) measures the desired output directly as a measurand so that propagation of uncertainty is unnecessary.

            The information transmitted from calibration experiments into a complete engineering experiment on engineering systems or a record experiment on engineering research needs to be in a form that can be used in appropriate propagation processes (my bold). … Uncertainty analysis is the sine qua non for record experiments and for systematic reduction of errors in experimental work.

            Uncertainty analysis is … an additional powerful cross-check and procedure for ensuring that requisite accuracy is actually obtained with minimum cost and time.

            Propagation of Uncertainties Into Results

            In calibration experiments, one measures the desired result directly. No problem of propagation of uncertainty then arises; we have the desired results in hand once we complete measurements. In nearly all other experiments, it is necessary to compute the uncertainty in the results from the estimates of uncertainty in the measurands. This computation process is called “propagation of uncertainty.”

            Let R be a result computed from n measurands x_1, …, x_n, and let W denote an uncertainty, with the subscript indicating the variable. Then, in dimensional form, we obtain: W_R = sqrt[ sum over i of ((∂R/∂x_i) · W_x_i)^2 ].”

            https://doi.org/10.1115/1.3242449
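
            A minimal numerical sketch of that root-sum-square propagation rule; the function and the measurand uncertainties below are purely illustrative, not values from the paper:

                import math

                def propagate(R, x, W, h=1e-6):
                    """Root-sum-square propagation: W_R = sqrt(sum((dR/dx_i * W_i)^2))."""
                    base = R(*x)
                    total = 0.0
                    for i, (xi, Wi) in enumerate(zip(x, W)):
                        xp = list(x)
                        xp[i] = xi + h
                        dRdxi = (R(*xp) - base) / h   # numerical partial derivative
                        total += (dRdxi * Wi) ** 2
                    return math.sqrt(total)

                # Illustrative only: R = x1 * x2, with measurand uncertainties 0.1 and 0.2.
                print(propagate(lambda a, b: a * b, [2.0, 3.0], [0.1, 0.2]))   # ~0.5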

          • Nick, “Now the year⁻¹ has faded away.

            Wrong again, Nick. It’s indexed away. I’ve answered your querulousness several times.

          • Tommy,
            “Isn’t the parameter Dr. Frank is isolating an input to the model at each iteration? Isn’t an input necessarily something we want to know about?”
            No, it isn’t. There is a parametrisation, to which Pat wants to attach this uncertainty. It isn’t a new uncertainty at each iteration; that would give a result that would make even Pat blanch. He says, with no real basis, every year. I don’t believe that it is even an uncertainty of the global average over time.

            “But, if you can’t predict the small things that are iterative inputs to your model”
            Because many have only transient effect. Think of a pond as an analogue solver of a CFD problem. Suppose you throw a stone in to create an error. What happens?

            The stone starts up a lot of eddies. There is no net angular momentum, because that is conserved. The angular momentum quickly diffuses, and the eddies subside.
            There is a net displacement of the pond. Its level rises by a micron or so. That is the permanent effect.
            And there are ripples. These are the longest lasting transient effect. But they get damped when reflected from the shore, or if not, then by viscosity.
            And that is it, typical of what happens to initial errors. The one thing that lasts is given by an invariant, conservation of mass, which comes out as volume, since density is constant.

          • Pat
            “Nick, “Now the year⁻¹ has faded away.”
            Wrong again, Nick. It’s indexed away. “

            You’ve excelled yourself in gibberish. Units are units. They mean something. You can’t “index them away”.

          • Nick, “You can’t “index them away”.

            Let me clarify it for you, Nick. Notice the yearly index “i” is not included in the right side of eqn. 5.2. That’s for a reason.

            But let’s put it all back in for you, including the year^-1 on the (+/-)4 W/m^2.

            Right side: (+/-)[0.42 * 33K * 4 W/m^2 year^-1 / F_0]_year_1, where “i” is now the year-1 index.

            Cancelling through: (+/-)[0.42 * 33K * 4 W/m^2 / F_0]_1.

            That is, we now have the contribution of uncertainty to the first year projection temperature, indexed “1.”

            For year two: (+/-)[0.42 * 33K * 4 W/m^2 year^-1 / F_0]_year_2, and

            (+/-)[0.42 * 33K * 4 W/m^2 / F_0]_2; index “2.”

            For year n: (+/-)[0.42 * 33K * 4 W/m^2 year^-1 / F_0]_year_n, and (+/-)[0.42 * 33K * 4 W/m^2 / F_0]_n; index “n.”

            = (+/-)u_i

            And those are what go on into eqn. 6, with units K.

            You may not get it, Nick, but every scientist and engineer here will.
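
            For the numbers, here is a minimal sketch of those per-step terms feeding eqn. 6. F_0 is set to 33.3 W/m^2 below purely as an illustrative stand-in for the total greenhouse forcing; use the paper's value for the real calculation:

                import math

                F0 = 33.3                         # W/m^2, assumed here for illustration only
                u_step = 0.42 * 33.0 * 4.0 / F0   # K per annual step, the (+/-)u_i above; ~1.66 K

                for n in (1, 20, 100):
                    u_n = math.sqrt(n * u_step**2)   # eqn. 6: root-sum-square over n annual steps
                    print(n, "years ->", round(u_n, 1), "K")
                # grows as sqrt(n): ~1.7 K, ~7.4 K, ~16.6 K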

          • Pat,
            “but every scientist and engineer here will”
            Every scientist and engineer understands the importance of being clear and consistent about units. Not just making it up as you go along.

            This story on indexing is just nuttier. To bring it back to Eq 5.1 (5.2 is just algebraically separated), you have a term
            F₀ + ΔFᵢ ±4Wm⁻²
            and now you want to say that it should be
            F₀ + ΔFᵢ ±4Wm⁻²year⁻¹*yearᵢ
            “i” is an index for year. But year isn’t indexed. year₁ isn’t different to year₂; years are all the same (as time dimension). Just year.

            So now you want to say that the units of 4Wm⁻² are actually 4Wm⁻²year⁻¹, but whenever you want to use it, you have to multiply by year. Well, totally weird, but can probably be made to work. But then what came of the statement just before Eq 6:
            “The annual average CMIP5 LWCF calibration uncertainty, ±4Wm⁻²year⁻¹, has the appropriate dimension to condition a projected air temperature emulated in annual time-steps.”
            What is the point of modifying Lauer’s dimension for the quantity, saying that that is the “appropriate dimension”, and then saying you have to convert back to Lauer’s dimension before using it?

          • Nick, “… but can probably be made to work.

            Good. You’ve finally conceded that you’ve been wrong all along. You didn’t pay attention to the indexing, did you, so intent were you on finding a way to kick up some dust.

            What is the point of modifying Lauer’s dimension for the quantity, saying that that is the “appropriate dimension”,

            Right. There you go again claiming that an annual average is not a per year value. Really great math, Nick. And you call me nutty. Quite a projection.

            … and then saying you have to convert back to Lauer’s dimension before using it?

            The rmse of Lauer and Hamilton is an annual average, making it per year, no matter how many times you deny it. Per year, Nick.

            I changed nothing. I followed the units throughout.

            You just missed it in your fog of denial. And now you’re, ‘Look! A squirrel!’ hoping that no one notices.

      • Stokes,

        It has been a long time since someone has referred to me as “handsome.” So, thank you. 🙂

        Now, pleasantries aside, on to your usual sophistry. The graph on the right side of the original illustration (panel b) has a red line that is not far from horizontal. That is the nominal prediction of temperatures, based on iterative calculations. It does not provide any information on the uncertainty of those nominal values. Overlaying that is a lime-green ‘paraboloid’ opening to the right. That is the envelope of uncertainty, and is calculated separately from the nominal values, again by an iterative process. The justification for the propagation of error in the forcing is stated clearly (Frank, 2019): “That is, a measure of the predictive reliability of the final state obtained by a sequentially calculated progression of precursor states is found by serially propagating known physical errors through the individual steps into the predicted final state.”

        I know that you are bright and well-educated, so other than becoming senile, the only other explanation for your obtuseness that seems to make sense is that you don’t want to understand it, perhaps because of your personal “tribalness.”

        You might find it worth your while to peruse this:
        https://www.isobudgets.com/pdf/uncertainty-guides/bipm-jcgm-100-2008-e-gum-evaluation-of-measurement-data-guide-to-the-expression-of-uncertainty-in-measurement.pdf

    • You don’t have to assume; that’s where the fudge factors come in. When the model run starts to deviate beyond what would be termed reasonable, a fudge factor is applied to bring it back into line. Given the known unknowns and the unknown unknowns involved in the climate, it is the ONLY way the models can run in the territory of reasonableness for so long.

      If any of these models were ever to be evaluated line by line, I would be willing to bet my house in a legally binding document that the above is the case.

  41. Thanks so much for all your perseverance Pat, I have been sharing your work with anyone who would listen, since I ran across it this past July. Once the public understands that all of the hype surrounding the “climate crisis” is based upon models, and that those same models are simply fictions created by those that “believe”, virtually all of this nonsense will end. And that is why your very important work is being misrepresented by those who stand to lose everything.

    • Thanks, Gator. And you’re right: literally the transfer of trillions of dollars away from the middle class into the pockets of the rich, and many careers, are under threat from that paper.

  42. Pat is making another critical explanation here that is not being addressed by most of the posters:
    How GCMs actually produce their output is completely irrelevant (black box). Whether it is a table lookup or the most advanced AI available is NOT relevant to his analysis. This point is being completely missed by most, and hence most of the criticism is irrelevant. Relevant criticism needs to address this key concept. To defend the skill of GCM output, defenders need to specifically address the CF uncertainty relative to CO2 effects.

  43. Here:

    https://wattsupwiththat.com/2019/09/19/emulation-4-w-m-long-wave-cloud-forcing-error-and-meaning/#comment-2799242

    … Nick S wrote (among other things):

    That is actually the point of GCM’s. They go beyond the time when weather can be predicted, but they don’t blow up. They keep calculating perfectly reasonable weather. It isn’t a reliable forecast any more, but it has the same statistical characteristics, which is what determines climate.

    His particular wording there caught my attention, because, at first glance, it seems nonsensical.

    How can climate models that produce outcomes following reasonable weather calculations determine a reasonably REAL climate forecast? What reliable prediction of climate could be fashioned from reasonable-but-unreliable forecasts? — that makes no sense to me. The unreliability of the concomitant weather forecasts would seem to propagate into the unreliability of the climate forecasts that the models produce.

    Apples might be rotten, but they still determine the pie? A rotten apple is still a reasonable apple? A pie of rotten apples is still a reasonable pie? An unreliable forecast is a reasonable forecast?

    A sum of reasonable-but-unreliable weather calculations would seem to constitute a reasonable-but-unreliable climate forecast. I think that what we primarily seek in climate models is reliable, NOT merely “reasonable”. Reasonable alone can be a mere artifact of internal consistency. Reasonableness in climate models seems to be built in — that’s what models are — reasonable representations of something, based on the reasoning in their own structure.

    What is UNreasonable about this is that the model does not represent reality — it represents a reasonable model of reality that is unreliable — unfit to dictate real-world decisions. Using an unreliable model to dictate real-world decisions is unreasonable. It is the USE of climate models, then, that is UNREASONABLE.

    Models probably have great use for studying the complexity of climate. They might be great educational tools. But they do not seem to be great practical tools to guide the development of civilization. They are UNREASONABLE tools for helping to shape civilization.

    • Exactly!

      They’re a model of something and that something has temperature outcomes that are plausible for our reality (in addition to being politically convenient), but as far as I can tell some very smart people aren’t making the connection that this says NOTHING about what is actually going to happen in the real world.

    • It sounds to me that he is saying the results they get are what they expected, so they do not think they could be wrong. In fact they do not believe they CAN be wrong.
      After all, it was just like they expected it would be.

  44. If a model can emulate a model, and we believe the first model is correct, wouldn’t it follow that the second model is also correct? And does it matter how it does it as long as the emulation is correct (limited to mean surface temperature)? After reading through all the comments to date, I’m still not convinced as to why one model (GCM or spreadsheet) is better than the other. Sure, one is fancier, but I can get to the ball in either vehicle.

  45. Prof. Frank,

    Decent ‘version for dummies’ – explains and clarifies many questions that sprouted from the original article. By the way, your article is doing fairly well: “This article has more views than 99% of all Frontiers articles”. Your paper is also in the top 5% of all research outputs scored by Altmetric. Well done!

    With respect to your current post:

    “When several sources of systematic errors are identified, [uncertainty due to systematic error] beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:

    “beta = sqrt[ sum over i of (theta_S_i)^2 ]

    Does this operation apply to each iterative step in the calculation (i.e., in the model), or do we calculate beta once and subsequently use it as a constant in each iteration? Advocatus diaboli may argue that this quotation from Vasquez and Whiting means we square and add the uncertainties, and that they propagate to the next iterations but remain constant and do not add up.

    • Paramenter, I’m just scientific staff, thanks. Not a professor.

      If one takes the model of Vasquez and Whiting and runs it through a series of sequential step-wise calculations to determine how a system changes across time, would the uncertainty in the final result be the root-sum-square of the uncertainties in all the intermediate states?

      • If one takes the model of Vasquez and Whiting and runs it through a series of sequential step-wise calculations to determine how a system changes across time, would the uncertainty in the final result be the root-sum-square of the uncertainties in all the intermediate states?

        I would say so. Still, how do simulations work, for instance, in the field of CFD, where they are used for advanced aerodynamic analysis? The results obtained are used in the design of wings and other components, so it simply works. Such simulations carry millions of steps each run. A small uncertainty must be associated with each step, but because of the sheer number of steps, adding the squared uncertainties associated with each step should cause the uncertainty to grow quickly and render the results useless. But that does not happen.

        Quotation from Journal of Fluid Engineering is decent:

        An uncertainty is not the same as an error. An error in measurement is the difference between the true value and the recorded value; an error is a fixed number and cannot be a statistical variable. An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable.

        It clearly distinguishes error from uncertainty, a concept that even people well versed in stats struggle with. I reckon we need another post just with clear definitions!

        • Paramenter,

          “A small uncertainty must be associated with each step, but because of the sheer number of steps, adding the squared uncertainties associated with each step should cause the uncertainty to grow quickly and render the results useless. But that does not happen.”

          If the uncertainty is very small then lots of steps still won’t overwhelm the result. Just how many variables in an aircraft simulation have a significant uncertainty? And these simulations *do* blow up under some conditions where the uncertainty becomes large.
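
          A toy root-sum-square comparison makes the point; the step counts and per-step uncertainties below are made up purely for illustration:

              import math

              # Root-sum-square growth of a constant per-step uncertainty over n steps.
              for label, u_step, n_steps in [("tiny per-step u, many steps", 1e-4, 1_000_000),
                                             ("large per-step u, few steps", 1.0, 100)]:
                  print(label, "->", round(math.sqrt(n_steps) * u_step, 3))
              # 0.1 vs. 10.0 -- a tiny per-step uncertainty over a million steps can still total
              # far less than a large per-step uncertainty over a hundred steps.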

          • Hey Tim,

            If the uncertainty is very small then lots of steps still won’t overwhelm the result. Just how many variables in an aircraft simulation have a significant uncertainty?

            True, many if not all parameters are well defined due to extensive experimental research, e.g., wind tunnels. Still, if we have millions of cells (each with its own small uncertainty) and millions of steps, I cannot imagine how such uncertainty does not propagate and accumulate. But I’m not an expert in this area, so it’s a question rather than any solid claim.

            And these simulations *do* blow up under some conditions where the uncertainty becomes large.

            Indeed. From what I’ve heard, some conditions are also very hard to simulate, e.g., higher angles of attack, where simulations may render wildly different results.

          • Paramenter,

            “Still, if we have millions of cells (each with its own small uncertainty) and millions of steps, I cannot imagine how such uncertainty does not propagate and accumulate.”

            Uncertainty does accumulate. The difference is that it doesn’t overwhelm the results. However, the simulations don’t give perfect answers. It’s why test pilots earn big bucks pushing the envelope of aircraft to confirm operational characteristics. Simulations can only go so far.

        • Paramenter, when people parameterize engineering models, they use the models only within the parameter calibration bounds.

          Inside those calibration bounds, observables are accurately simulated.

          Climate model projections proceed very far beyond their parameter calibration bounds. The predictive uncertainties are necessarily huge.

  46. “What happens inside the black box is irrelevant.”

    This doesn’t sound quite sound, principally. Or perhaps more like: what’s expected to come out of the box is very relevant. Since the box is a simulation with its own sets of feedback loops and equilibria, made invisible because of the nature of the defined box, one cannot simply make statements about the kind of propagation. There’s no simulation possible (in theory, even) which actually reflects the exact processing. What matters is whether the emulation can be used to approach real data, whether it behaves *close* enough, even while perpetually “wrong”.

    In the end it’s not so much about Dr. Frank’s conclusions being wrong, but more about how they’re closer to irrelevant to the models in use. The suggested uncertainty range and its relevance are uncertain as well. What matters in the end is whether the model turns out to be useful; use it until it doesn’t.

    If someone wants to base climate policies on JUST that, well, that’s another question.

    • Aren’t climate models themselves a sort of “black box”, since they are abstractions from reality?

      And, in the sense that the models cannot “know” what some of the real quantities are, isn’t this a sort of “it doesn’t matter what goes on in the black box” state of knowledge? Yet, we value the output as reasonable.

      We value output from these black boxes (climate models), where we don’t know what is going on precisely, and yet we should question models of these black boxes (Pat’s emulation equation) that produce their (climate models’) outputs with the same inputs?

      It seems like the same sound reasoning is being applied to the climate-model emulation as is being applied to the climate models being emulated.

      Emulations of simulations. A black box mimicking a group of black boxes. Lions and tigers and bears, oh my! — it’s all part of Oz — We’rrrrrrrrrrrrrrrrrrrrrrrrr off to see the Wizard ….

  47. Pat Frank,
    Well done, but it highlights how many people struggle with the concept of error propagation and how many hoops they will jump through to avoid doing the proper math and producing error bars or some other means of showing what the uncertainty is within their calculations.
    v/r,
    Dave Riser

    • Sad if true, David.

      I recall reading a paper some long time ago, where the author expressed amazement that a group of researchers were using a slightly exotic way to calculate uncertainties in their results, they said, because it gave smaller intervals.

      So it goes.

  48. “… systematic errors are associated with calibration bias in [methods] and equipment… Experimentalists have paid significant attention to the effect of random errors on uncertainty propagation in chemical and physical property estimation… “-Vasquez and Whiting

    And therein lies the rub. Whatever they may be, climate modelers are most certainly NOT experimentalists.

  49. Hi Pat Frank

    This may be of interest to you: an ASP colloquium discussing model tuning/parameterizations, including clouds, in the context of model uncertainty.

  50. “This is why the uncertainty in projected air temperature is so great. The needed resolution is 100 times better than the available resolution.”

    No, I do not agree… Even if the resolution is 100 times better for clouds, it does not follow that they have captured all of the various processes and feedbacks that contribute to the overall climate. One has to assume that the models have captured all of the important processes, and that no undiscovered process or feedback will shift a yearly output enough to start a cascade effect. It’s called a complex chaotic system (more than one chaotic process is involved).

    One CANNOT predict the behavior of a natural chaotic system within a narrow margin over enough time. One might be able to say, statistically, there is an “n”% chance of an outcome, but that “n”% will grow to a meaningless margin after enough time. 100 years is simply beyond any model – we just do not understand the climate processes well enough – clouds are just one set of processes we don’t understand well.

    For example, let’s say that there is natural heating (forget CO2 as the magic molecule) and it drives a change in wind patterns over North Africa which carry greater loads of dust over the Atlantic. This then drives changes to condensation, which drives changes in temperatures for a large region, which… etc. The models simply cannot capture all this complexity. If you change the climate for a large enough portion of the Earth (say 10% to 20%) through such an unpredicted change, then the odds that it will impact other regions’ climate are large.

    Climate models depend on averaging the measurements we think are important over a period we have made measurements in…they cannot predict the unknown, unmeasured, completely surprising changes that will occur. It is pure arrogance to believe we can “see” the effects of trace amounts of CO2 on the climate system.

    Discovering that a complex computer model filled with differential formulas acts in a linear manner doesn’t surprise me one bit. I ran into this kind of behavior again and again when working on mathematical algorithms back in the 1980’s. I kept finding that all my tweaks and added complexity were completely overwhelmed by the already established, simpler equations. It is a matter of scale… if the complex formulas that make up 99% of the code produce only 1% of the output, and the other 1% of simple code produces 99% of the output, then you will get a nasty surprise when you finally analyze the result. That is what you just did for them.

    • Your example of increased dust from North Africa entering the Atlantic is even more complex. One result of that can be an increase in plankton by orders of magnitude from year to year. This then results in a similar increase in the recruitment of various fish species that depend on the timing of plankton blooms for survival past the larval stage (primary production has ramped up big time in the northeast Atlantic this year, as will be seen in plankton surveys).

      All that increase in oceanic biomass is retained energy, not to mention that water with high concentrations of plankton also has the potential to store more heat, in a similar way that silt-laden shallow water warms quicker and to a higher temperature than clear water. Yes, in the big scheme of things the above may be small potatoes (though I think the plankton alone might be a very large potato), but there are many natural variations and processes that are not even considered in climate models, never mind “parameterized”.

      • “but there are many natural variations and processes that are not even considered in climate models, never mind “parameterized”.”

        It’s what Freeman Dyson pointed out several years ago. The climate models are far from being a holistic simulation of the Earth. We know the Earth has “greened” somewhere between 10% and 15% since 1980. That has to have some impact on the thermodynamics, from evapotranspiration if nothing else. It could be a prime example of why temperatures are moderating in global warming “holes” around the globe. Yet I’ve never read where the climate models consider that at all.

        • It’s worse than that. Those are second-order effects which, yes, any half-decent scientist would expect to be factored into any ‘grand view’ of the system produced by climate models.

          But the enthalpy of evaporation of water changes by about 5% over the temperature ranges experienced on the Earth. When someone posed the question on Judith Curry’s blog a few years ago, “Do climate models take this into account?”, the answer was “No.”
          No scientist should take these models seriously. They are toys, not serious research tools.

    • But the point of Pat Frank’s hypothesis is not that the climate models are right or wrong, only that the uncertainty of projected temperatures is so large that it is not possible to know if the answers are any good or not.

  51. “The paper presented a GCM emulation equation expressing this linear relationship, along with extensive demonstrations of its unvarying success.

    In the paper, GCMs are treated as a black box. GHG forcing goes in, air temperature projections come out. These observables are the points at issue. What happens inside the black box is irrelevant.”

    Who needs models when Pat Frank can emulate them (modeling models) with a simple linear equation? Who needs physics when what happens in “the black box is irrelevant”? Who needs science when we have Pat Frank?

  52. There’s a crying need here to clarify how iterative calculations, no matter how precise numerically, may or may not produce highly erroneous, misleading results in modeling. A simple, physically meaningful example is provided by the recursive relationship
    y(n) = a y(n-1) + b x(n)
    where y is the output, n is the discrete step-number and x is the input. Both a and b are positive constants.

    If a + b = 1, then we have an exponential filter, which mimics the response of a low-pass RC circuit. The output is stable and there is no scale distortion. The effect of error in specifying the initial, or any subsequent, input will not accumulate, but will die off exponentially. Only systematic, not random, errors persisting over long-enough intervals will show a pronounced effect in the output. But if a + b > 1, then there’s an instability, i.e. an artificial amplification of output due to ever-increasing scale distortion. This is an unequivocal system specification error that can grow uncontrollably.

    The critical question here should be whether the “cloud forcing error,” while undoubtedly introducing a serious error, can be treated as a system specification error, or is it just a randomly varying error in specifying the system input?
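
    A few lines of code illustrate that stability contrast; the coefficient values are chosen only for illustration:

        def settle(a, b, x0_error, n_steps=50):
            """Inject an input error at step 0 only, then iterate y(n) = a*y(n-1) + b*x(n) with x = 0 thereafter."""
            y = b * x0_error
            for _ in range(n_steps):
                y = a * y
            return y

        print(settle(0.9, 0.1, 1.0))    # a + b = 1: the injected error has decayed to ~5e-4
        print(settle(1.05, 0.1, 1.0))   # a + b > 1: the same error has grown to ~1.1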

      • Climate modelers indeed have much for which to answer; see second paragraph of my response to Stokes below. But insofar as a plus/minus interval of “cloud forcing error,” upon which Frank insists, smacks of random error in system input, rather than system specification error, the ball is in his court to defend his very specific claim of error propagation.

        BTW, the problem of detecting a very gradually varying signal amidst much random noise depends upon the spectral structure of both and is not simply a matter of overall S/N ratio.

    • “There’s a crying need here to clarify how iterative calculations, no matter how precise numerically, may or may not produce highly erroneous, misleading results in modeling”

      It was done in my post on error propagation in DE’s. You have given a first order inhomogeneous recurrence relation, analogous to a DE. The standard solution I gave was
      y(t) = W(t) ∫ W⁻¹(u) f(u) du, where W(t) = exp( ∫ A(v) dv ), and integrals are from 0 to t

      The analogous solution for your relation is
      y(n)=aⁿΣa⁻ⁱxᵢ … summing from 0 to n, and absorbing b into x (try it!).
      So the issue is whether a is >1, when the sum converges unless x is growing faster, and the errors grow as aⁿ, or <1, when the final term is a⁻ⁿx and the sum approaches a finite limit. This is exactly analogous to my special case 3, and governs what happens to positive and negative eigenvalues of the A matrix. Time discretisation of that linear DE gives exactly the recurrence relation that you describe.
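
      A quick numerical check of that closed form against direct iteration of the recurrence; the values of a, b and x below are arbitrary, purely for illustration:

          a, b = 0.9, 0.1
          x = [1.0, 0.5, -0.3, 2.0, 0.0, 1.5]

          y = 0.0
          for xn in x:
              y = a * y + b * xn   # direct iteration of y(n) = a*y(n-1) + b*x(n)

          n = len(x) - 1
          closed = a**n * sum(a**(-i) * b * xi for i, xi in enumerate(x))
          print(y, closed)         # the two values agree (up to floating-point rounding)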

      As to the cloud uncertainty, I think it has been misread throughout, but no-one including Pat Frank has been interested in finding out what it really is. As I understand it, it comes from the correlation of satellite and GCM values at locations separated in time but mainly space. It is predominantly spatial uncertainty, and would largely disappear on taking a spatial average.

      • Nick Stokes: It was done in my post on error propagation in DE’s.

        As to the cloud uncertainty, I think it has been misread throughout, but no-one including Pat Frank has been interested in finding out what it really is.

        What do you think of my proposal of using bootstrapping to get an improved propagation of the uncertainty in the parameter?

        • “my proposal of using bootstrapping to get an improved propagation of the uncertainty in the parameter”
          Well, my first point is that everyone seems to think they know what the parameter is, and I don’t think they do. I think it represents total correlation, or lack of it, over space and time, and predominantly space, so if you take a global average at a point in time, there is a lot of cancellation.

          But I don’t think bootstrapping to isolate that parameter is a good idea, and I’m pretty sure it wouldn’t work. The reason is the diffusiveness. As I mentioned, after a few steps (or days) error propagated from one source is fairly indistinguishable from that from another. There isn’t any point in trying to untangle that. The loss of distinctness would disable bootstrapping, but is a great help in ensemble, because it reduces the space you have to span.

          • Nick Stokes: Well, my first point is that everyone seems to think they know what the parameter is, and I don’t think they do.

            What you also wrote is that the parameter is known because it is written in the program. Every other writer has claimed that the parameter is not known.

            The reason is the diffusiveness. As I mentioned, after a few steps (or days) error propagated from one source is fairly indistinguishable from that from another.

            I am more and more persuaded that you do not understand Pat Frank’s algorithm. But if you are estimating the resultant uncertainty of multiple sources of uncertainty in parameters and other details such as grid size and initial value, bootstrapping can certainly handle that.

            To me there is a disparity between claiming that the only way to propagate error is to run the whole program and saying that bootstrapping can not estimate the uncertainty distribution, when bootstrapping does what you recommend many, many times.

            Anyway, I appreciate your answer.

            As I put all your answers together, there is no way to estimate the uncertainty in the computational results even with infinite computing power.

          • “What you also wrote is that the parameter is known because it is written in the program”
            I said that parameters that are written in the program are known. That is trivially true. The point is that once you have done that you can run perturbations, as I have shown.

            But the 4 W/m2 is not written into GCMs. It is something that Pat has dug out of a paper. He rests his analysis on it, but neither he nor his fans seems to have the slightest interest in finding out what it actually means. Nor its units.

          • Nick Stokes, “neither he nor his fans seems to have the slightest interest in finding out what it actually means. Nor its units.

            I’ve clarified the units for you ad nauseam, Nick. It never seems to take.

            As to meaning, the (+/-)4 W/m^2 is a lower limit of resolution for model simulations of tropospheric thermal energy flux. Also discussed extensively in the paper. Also something you never seem to get.

            Inconvenient truths of another kind, perhaps.

      • It was done in my post on error propagation in DE’s. You have given a first order inhomogeneous recurrence relation, analogous to a DE.

        I’m keenly aware that linear system behavior can be described by DEs as well as by impulse response functions, continuous or discrete. My aim here was to CLARIFY the different types of error that can arise in ITERATIVE calculations in terms far more accessible to the WUWT audience than those implicit in DE theory.

        Moreover, I’ve long pointed out that the very concept of “cloud forcing” prevalent in “climate science” is physically tenuous, at best. While clouds, no doubt, modulate the SW solar forcing in producing the insolation that actually thermalizes the surface, their LWIR is merely an INTERNAL system flux, inseparably tied to the terrestrial LWIR insofar as actual heat transfer to the atmosphere is concerned. The uncertainty that arises from conflating these two distinct functions goes beyond any uncertainty of cloud fraction arising from lack of model-grid resolution.

  53. “Discovering that a complex computer model filled with differential formulas acts in a linear manner doesn’t surprise me one bit. I ran into this kind of behavior again and again when working on mathematical algorithms back in the 1980’s”.

    Back in my Saturn S-IVB days, I joined a project that was using a huge relaxation network to calculate thermal conductivities from ten radiometer measurements. For an offering, the high priests in white coats would grant us the k value a week later. One afternoon I sat down with pencil and paper and discovered that for almost every material, k could be calculated in minutes by averaging the front and back measurements, subtracting, and multiplying by a constant. No computer required. Oopsie.

    • You bring up a good point. Climate doesn’t just change drastically. You don’t see deserts go to lush forests in the wink of an eye. You don’t see marsh lands turn into farm land overnight. These are generally slow, gradual changes that are linear, not exponential.

      If you have a model that can not run continuously without “going off the rails”, then you don’t have all the equations driving it correct. I don’t know how else to say it, you just don’t have it correct. Would it surprise anyone that the models are a Rube Goldberg device that end up depicting a linear output of temperature?

  54. I have a bank account that pays 2 percent interest. They tell me that my money will double because of compound interest. But the bank keeps taking these little bits of money out.

  55. All this discussion has centered around just ONE aspect of climate-model uncertainty. Aren’t there other aspects open to an equal amount of argumentation? Put all that together with this, and the conclusion arrived at is … worse than we thought … that is, climate models are worse than useless, as far as being constructive tools for society. They are destructive tools, because of their highly inappropriate application in policy decisions.

    • There is a HUGE uncertainty with the models that no one has addressed yet! Even Pat Frank has only addressed the *average* temperature projection. But the average temperature is pretty much meaningless, it has no certainty associated with it at all when it comes to actual reality.

      Once you take an *average* you lose data. You no longer have any data on what is happening to either the maximum temperatures or the minimum temperatures, both globally and regionally. And it is the maximum temperatures and the minimum temperatures that will determine what happens to the globe, not the average. If high temperatures are moderating and minimum temperatures are as well, the globe is headed into an era of high food productivity and decreased human mortality. It is maximum temps that have the biggest impact on food production during the main growing season, yet we know almost nothing about where that is going. Based on the consecutive global record grain harvests over the past six years, it is highly unlikely that maximum temps are going up; it is far more likely they are moderating.

      Yet the climate alarmists, and this includes the modelers, assume that an increasing average means that maximum temperatures are going up as well. Yet the uncertainty associated with whether the maximum temps track the average temps is very high. The average can go up because of moderating minimum temps just as easily as because of increasing maximum temps!

      When was the last time you heard of large-scale starvation occurring over a long time scale anywhere on the globe? I don’t know of any over the past two decades, not in Africa, Asia, South America, etc. Short-term localized problems, sure, but nothing widespread and lasting decades. And yet this is what the AGW modelers want us to believe their models are predicting!

      Talk about uncertainty!

      • It’s been more than 40 years since the record high temp for any continent on earth (Antarctica) was broken. The next youngest continental record high temp is Asia’s, set in 1942.

        • “It’s been more than 40 years since the record high temp for any continent on earth (Antarctica) was broken.”

          And yet we are to believe that the Earth is “burning up”? And we’ll all be crispy critters by 2035?

  56. A question for Nick Stokes (or anyone else !!)

    Have any of the GCMs been used to replicate the climate from, say, the year 1650 to say 1800, or have they been able to replicate the climate from the start of the Medieval Warm Period up to the end of the Little Ice Age?
    Or perhaps to model the onset of the most recent ice age and its subsequent demise?

    Since these time periods were before the industrial revolution and presumably before humans could have impacted the climate, there perhaps would be one less “variable” to consider; maybe this will make modeling the climate for these time periods a little bit easier.

    If GCMs have not yet been back tested for time periods prior to the industrial revolution to assess their reliability, why haven’t they been?
    Should not this “test” be conducted to assess the reliability of the models?

    If they have, how did the results compare to the actual climate?

    Many doubts about GCMs presently used would disappear if it could be shown that these models accurately (more or less) can replicate the historical climate. One would think that the AGW proponents would be working overtime to achieve good results in this endeavor.

    You know, in engineering all mathematical models are compared to and/or calibrated to real world experimental and/or field test results. So even if the input parameters are not “exactly” known – say, in your example of an ICE – after enough real world testing, a model(s) can be “calibrated” and/or modified to provide results whose reliability is consistent with that required to model a physical process and perhaps to help make predictions.
    And just as important, this continual comparison of real world vs model results allows the experimenter to KNOW the model’s deficiencies and its inapplicability.

    Anyway, I think it would interest many folks to have answers to my very basic questions.

    • John,
      The simple answer would be that in order to compare the results of computer models to
      past climates two things need to be known. Firstly, the climate needs to be known sufficiently
      accurately — a few data points from Europe and China are not enough. Secondly, the forcings
      which are input into the climate models also need to be known; that requires estimates for solar variability,
      volcanic eruptions, CO2 levels etc. going back hundreds of years. Neither of these things (the forcings
      and the actual climate) is known to a sufficient level to allow the tests you describe to be carried out
      to the level that would convince a skeptic.

      One way to look at it is that a climate model starting from scratch can predict the global temperature to
      within 0.2% (i.e. roughly 0.5 degrees K error out of about 300K). Would the ability to predict temperature to within 0.2% convince you that global climate models are accurate? If not, how accurately do you think
      global climate models must be able to predict the temperature before you would accept that they are accurate?

      • “One way to look at it is that a climate model starting from scratch can predict the global temperature to
        within 0.2%”

        Really? Then why is there such a large variation in output among the different models? And why does their output differ from the satellite record by more than this?

        And why did none of the models predict the global warming pause?

        It’s not obvious that they can predict anything to within .2%!

        Nick Stokes says the initial conditions are irrelevant to the models, that the effect of the initial conditions in the models quickly disappears after several steps. So why would actual conditions in 1650 make much difference to the models? Just pick some numbers and go for it!

        • Tim,
          the average temperature of the earth is about 288 degrees Kelvin. Global climate models
          can predict an average temperature of about 288 degrees to within 0.5 degrees. That
          corresponds to an error in temperature of 0.5/288 or about 0.2%.

          • “Tim,
            the average temperature of the earth is about 288 degrees Kelvin. Global climate models
            can predict an average temperature of about 288 degrees to within 0.5 degrees. That
            corresponds to an error in temperature of 0.5/288 or about 0.2%.”

            You didn’t answer a single question I asked. Why is that?

            And you’ve apparently missed the whole conversation about uncertainty associated with the models!

      • Izaak;
        Thanks for your response.

        Just wondering; do GCMs account for solar variability ?
        Apparently this variable is important, so I assume the GCMs do in fact account for this.
        Am I mistaken?

        Also, there is ample documentation of volcanic activity since at least 300 to 500 years ago; certainly since 1900 or so.
        Why then can’t GCM modelers input this information into their models; say since the year 1900?
        Also, do current GCMs account for volcanic activity over, say, the last 50 to 100 years or so?

        Michael Mann produced a temperature record going back to the year 1000; that’s 1000 years of data. If he was able to reproduce the temperature record over this time span, why can’t GCM modelers use his data – or other data – as input to model, say, the last 1000 years or 500 years or 300 years or even since 1900 or so?
        It would seem fairly simple to use a GCM to model the climate since the year 1900 given that Mann was able to replicate more than 1000 years of temperature data.

        It seems to me that “turning on” a GCM in the year 1900 (this is only 120 years ago – very recent) should produce results that model the actual, documented climate fairly accurately. This is much easier than trying to reproduce Mann’s 1000 year record of temperature.
        Has this been done?
        If not, why not?
        I assume that if this “experiment” were conducted, the results would fit very closely the actual climate since the year 1900.
        Don’t you agree?
        Also, if executed it will immediately silence all the “deniers,” since “back predicting” should be much easier than “forward predicting.”

        Thanks again for your initial response and I look forward to the response to my new questions.

        • Hi John,
          The short answer is that it has been done and by using revised estimates of forcings
          climate models can explain 98% of the variability in the climate since 1850. Have a look
          at “A Limited Role for Unforced Internal Variability in Twentieth-Century Warming”
          by Karsten Haustein et al. at
          https://journals.ametsoc.org/doi/10.1175/JCLI-D-18-0555.1

          But note in order to get this level of agreement they used new estimates for the sea temperatures based on corrections to measurements that almost everyone here would
          disagree with. So as I said originally there are plausible estimates for both forcings and
          temperatures that allow GCMs to explain 98% of the temperature variations since 1850.

          Hence my question would be, having read the paper above are you convinced and do you
          think that paper would silence even one denier?

  57. “It is difficult to get a man to understand something, when his salary depends on his not understanding it.”

    somehow this seems relevant here…

  58. Leaving aside the errors that Nick and Dr. Spencer have pointed out, there is at least one more
    major error in Dr. Frank’s paper as far as I can see. He proposes the following emulation equation
    for the global temperature (Eq. 1 in his paper):
    ΔT_t(K) = f_CO2 × 33K × [(F_0 + ∑_i ΔF_i)/F_0] + a
    which can be trivially rewritten as:
    ΔT_t(K) = f_CO2 × 33K × [1 + (∑_i ΔF_i)/F_0] + a
    Now none of the terms in this equation have any direct connection with reality. Rather, they are
    an attempt to fit the outputs of global climate models. As Dr. Frank says, “This emulation equation is not a model of the physical climate”. So there is a very natural question as to which of these parameters
    best describes the effects of long-wave cloud forcing. Dr. Frank claims with no evidence that the long
    wave cloud forcing is represented by the ΔFi term, which would appear to be incorrect since he states
    “ΔFi is the incremental change in greenhouse gas forcing of the ith projection time-step”. Hence that
    term only represents the change in greenhouse gas concentrations. The other two terms would appear
    to be F0 and a.

    Now “a” represents a constant offset temperature in Eq. 1, and by simple error analysis an error in “a”
    would give a constant offset temperature independent of time. The errors would not grow in the fashion that
    Dr. Frank describes. The other alternative would be that the 4 W/m^2 represents an error in the estimate of
    F0. Which does at least have the right units. But then Eq. 5.1 should read:
    ΔT_i(K) ± u_i = 0.42 × 33K × [1 + ΔF_i/(F_0 ± 4 W/m^2)]
    which gives for Eq. 5.2
    ΔT_i(K) ± u_i = 0.42 × 33K × [1 + ΔF_i/F_0] ± [0.42 × 33K × 4 W/m^2/(F_0 × F_0)]
    which is smaller than Dr. Frank’s estimate by a factor of F0, which is about 30. The additional factor of
    F0 comes from Eq. 3 when you take the derivative of 1/F0 to get 1/F0^2.

    We now have an error term that is 30 times smaller than Dr. Frank’s estimate using his same methodology,
    and if we then propagate that for 100 years as he did, the net result of the error in forcing is about 0.5 degrees,
    which is significantly smaller than the temperature change due to CO2 levels. Thus Dr. Frank is wrong that
    GCMs are unable to predict global temperatures because the error terms are too large.
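
    For concreteness, here is a minimal numeric sketch of the two propagation choices at issue — the per-step uncertainty as the paper’s eqn. 5.2 has it, versus the alternative derived above — assuming f_CO2 = 0.42, 33 K, ±4 W/m^2, F_0 ≈ 30 W/m^2 (the “about 30” above) and 100 annual steps. The constants and the code are illustrative only, not taken from any GCM or from the paper’s own calculations.

    ```python
    # Illustrative numbers only; f0 = 30 W/m^2 follows the "about 30" above.
    import math

    f_co2 = 0.42   # Manabe and Wetherald GHG fraction used in the paper
    k_gh = 33.0    # K, net greenhouse temperature
    u_lwcf = 4.0   # W/m^2, LWCF calibration error statistic
    f0 = 30.0      # W/m^2, total GHG forcing at projection time t = 0 (assumed)
    n_years = 100

    u_step_paper = f_co2 * k_gh * u_lwcf / f0     # per step, as in eqn. 5.2 (~1.85 K)
    u_step_alt = f_co2 * k_gh * u_lwcf / f0**2    # per step, alternative above (~0.06 K)

    # Root-sum-square over N equal annual steps: u_N = u_step * sqrt(N)
    print(f"paper:       ±{u_step_paper * math.sqrt(n_years):.1f} K after {n_years} years")
    print(f"alternative: ±{u_step_alt * math.sqrt(n_years):.2f} K after {n_years} years")
    ```

    Which of the two is the right place to attach the ±4 W/m^2 is exactly what the exchange below disputes.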

    • ΔT_t(K) = f_CO2 × 33K × [(F_0 + ∑_i ΔF_i)/F_0] + a
      which can be trivially rewritten as:
      ΔT_t(K) = f_CO2 × 33K × [1 + (∑_i ΔF_i)/F_0] + a

      Try your math again.

      [(∑_i ΔF_i)/F_0] × (1/F_0) is not (∑_i ΔF_i)/F_0

      Dr. Frank also states “In equations 5, F0 + ΔFi represents the tropospheric GHG thermal forcing at simulation step “i.” The thermal impact of F0 + ΔFi is conditioned by the uncertainty in atmospheric thermal energy flux.”

      F0 is not the uncertain flux value; F0 + ΔFi is. So your factor F0 ± 4 W/m^2 is incorrect.

      • Hi Tim,
        The point is that the uncertainty in thermal energy flux corresponds to an error in F0,
        not in Delta Fi, which is the additional forcing due to rising CO2 in the simulations. The change
        in forcing due to rising CO2 levels is well known, but there is still an error in the other terms,
        which must correspond to errors in F0.

        • Izaak, “The point is that the uncertainty in thermal energy flux corresponds to an error in F0…”

          No, it does not. F_0 is the constant unperturbed GHG forcing, calculated from Myhre’s equations. Totally muddled thinking, Izaak.

          • Dr. Frank,
            You explicitly claim that your equation one is not a model of the physical world;
            rather, it is an emulation model of the global climate models. Hence you cannot
            calculate the value of F0 using values from a physical theory; rather, it must
            simply represent a best fit to the outputs of GCMs.

            Suppose that a GCM is run with no increase in CO2 levels. Would you claim that
            in that case it would have no error? Or would there still be an error due to the
            long wave cloud forcing uncertainty? In which case which term in your Eq. 1 with
            Delta F=0 represents that error?

          • Izaak, F_0 is projection year t = 0 forcing. I explain that in the paper, and how it’s calculated.

            You wrote, “Suppose that a GCM is run with no increase in CO2 levels. Would you claim that in that case it would have no error?

            My paper is about uncertainty, not error. How many times must this be repeated?

            The uncertainty comes from deficient theory within the model. Do you suppose the theory remains deficient inside the model, whether CO2 is increased or not?

            Or would there still be an error due to the long wave cloud forcing uncertainty?

            If a GCM is simulating the climate with (delta)F_i = 0, does the deficient theory within the model still cause it to simulate global cloud fraction incorrectly?

            If the answer is yes (and it is), then why should there not be an uncertainty in simulated LWCF?

            In which case which term in your Eq. 1 with Delta F=0 represents that error?

            Look at the right side of eqn. 5.2, Izaak. That answers your question.

            In eqn. 1, when (delta)F_i = 0 and a = 0, the eqn. calculates the unperturbed greenhouse temperature, with f_CO2 varying with the given climate model.

            When f_CO2 = 0.42, the Manabe and Wetherald fraction (see SI section 2), the unperturbed temperature is 13.9 C.

            You clearly need to do some careful reading.

        • Izaak,

          The term F0 + ΔFi represents the input to the next iteration of the model. The entire term is what is uncertain, not just F0.

          Error is the difference between a measurement and the true value. That is not uncertainty. Uncertainty envelopes the range of possible errors but does not specify the actual error itself.

          It’s why uncertainty doesn’t cancel out the way random errors do under the central limit theorem – uncertainty is not a random variable.
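
          A toy illustration of the distinction being drawn here, with made-up numbers: averaging beats down random scatter, but a fixed, unknown offset — the kind of thing an uncertainty interval has to cover — is untouched by averaging.

          ```python
          # Made-up numbers: a fixed but unknown offset does not average away.
          import random

          random.seed(1)
          true_value = 10.0
          unknown_offset = 0.7   # systematic: fixed, but unknown to the measurer
          n = 10_000

          readings = [true_value + unknown_offset + random.gauss(0.0, 1.0) for _ in range(n)]
          mean = sum(readings) / n

          print(f"mean of {n} readings: {mean:.3f} (true value {true_value})")
          # The random scatter has shrunk roughly as 1/sqrt(n), but the mean still sits
          # about 0.7 away from the true value. An uncertainty interval has to be wide
          # enough to cover that possibility; taking more readings does not narrow it.
          ```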

          • Tim,
            There is no term “F0 + Delta Fi” in Dr. Frank’s equation. The full expression is
            (F0+Delta Fi)/F0
            which can be rewritten as
            1+ Delta Fi/F0
            where Delta Fi and only Delta Fi is the input to the next iteration of the model. F0 is
            a constant that represents some element of the physics while Delta Fi is the additional
            forcing due to rising CO2 levels.

          • “The full expression is
            (F0+Delta Fi)/F0
            which can be rewritten as
            1+ Delta Fi/F0”

            Really nice math skills, eh?

            [(F0+Delta F)/F0]*(F0/F0)
            becomes
            (F0)[1 + (Delta F/F0)]
            which becomes the value for the next iteration.

            Why did you drop the F0 multiplicative term?

            Rather than shooting from the hip on this why don’t you read Frank’s writings?

          • Just because F_0 can be factored out doesn’t change anything, Izaak.

            The uncertainty term on the right side of eqn. 5.2 remains the same.

            Just curious, Izaak — doesn’t your “(F0+Delta Fi)/F0” contain the term F0+Delta Fi, which is the term you wrote is not present?

          • Dr. Frank,
            As you surely know
            (F0+ Delta Fi)/F0 = 1+Delta Fi/F0
            This means that errors in F0 have a proportionally smaller
            effect than an error in Delta Fi. The error term for F0 is proportional to
            the derivative of the emulation equation with respect to F0, so it
            is Delta F/F0^2, where Delta F is the uncertainty in F0, which must include
            the uncertainty in the long wave cloud forcing.

          • “This means that errors in F0 have a proportionally smaller
            effect than an error in Delta Fi. ”

            This isn’t about errors! How many times must that be repeated? It is about uncertainty!

          • Izaak, F_0 is a given. It has no error.

            The (delta)F_i is a given. It has no error.

            The (+/-) comes from calibration uncertainty due to model simulated cloud fraction error.

            You are getting everything wrong.

            The reason you are getting everything wrong is because you are imposing your mistaken understanding on what I have written.

    • Izaak Walton, “Leaving aside the errors that Nick and Dr. Spencer have pointed out…

      Another guy who thinks a calibration error statistic is an energy flux. There’s a good critical start.

      IW, “Dr. Frank claims with no evidence that the long wave cloud forcing is represented by the ΔFi term …

      Not correct. Nowhere do I claim the ΔFi is long wave cloud forcing.

      IW “The errors [in term a] would not grow in the fashion that Dr. Frank describes.”

      I describe no such error.

      IW “The other alternative would be that the 4 W/m^2 represents an error in the estimate of
      F0.

      How about the alternative it actually means, Izaak, which is the CMIP5 calibration error statistic in long wave cloud forcing.

      And it’s ±4 W/m^2, not +4 W/m^2. None of my critics seem able to get the difference straight. The difference between ± and + is deep science I know, but still…

      I was going to go through the rest of your analysis, Izaak, but it is so muddled there’s no point.

      • Dr. Frank,
        Again the point is that none of the terms in your emulation equation correspond to physical
        effects, which is what you explicitly state — rather, they are fitting parameters to the outputs
        of global climate models. As such there is no one-to-one correspondence between the parameters
        in a GCM and your emulation equation. So why would the CMIP5 calibration error statistic
        directly enter your equation exactly as does a change in forcing due to rising CO2? Rather, it
        should enter as an error in the estimate of F0, which includes all of the forcings due to the
        current levels of greenhouse gases.

        • “Rather, it
          should enter as an error in the estimate of F0, which includes all of the forcings due to the
          current levels of greenhouse gases.”

          To repeat (seemingly endlessly): error and uncertainty are not the same thing. You require a specific number for the size of the error. Uncertainty cannot give you that specific number; it can only tell you the range within which the error can occur. As you perform successive iterations that range of possible error gets bigger, which is exactly what Dr. Frank’s analysis shows.

          • Tim,
            the issue is which number in Dr. Frank’s equation is uncertain. Suppose all of the
            error lies in the offset value “a”. This would not propagate through in the manner
            Frank suggests but would rather just give a constant error. Errors in Delta F
            have a bigger effect.

          • “the issue is which number in Dr. Frank’s equation is uncertain.”

            Huh? Dr. Frank explains this very well, there is no uncertainty in his presentation.

            “F0 is the total forcing from greenhouse gases in Wm–2 at projection time t = 0, and ΔFi is the incremental change in greenhouse gas forcing of the ith projection time-step, i.e., as i-1→i.”

            “The finding that GCMs project air temperatures as just linear extrapolations of greenhouse gas emissions permits a linear propagation of error through the projection. In linear propagation of error, the uncertainty in a calculated final state is the root-sum-square of the error-derived uncertainties in the calculated intermediate states (see Section 2.4 below) (Taylor and Kuyatt, 1994).”

            “The thermal impact of F0 + ΔFi is conditioned by the uncertainty in atmospheric thermal energy flux. ”
            ——————————

            “Suppose all of the error lies in the offset value “a”.”

            The value of “a” is calculated from the curves of the various GCM’s. See Table S4-3 in Frank’s supplementary documentation. If “a” has an error it is caused by errors in the curves from the GCM’s! Meaning the GCM’s are the source of uncertainty!

          • So Tim,
            A question for you — why is it possible to know F0 with absolute precision
            and ΔFi only to within an accuracy of +/- 4 W/m^2? Surely if F0 is the
            “total forcing from greenhouse gases” then that term contains the uncertainty of
            +/- 4 W/m^2? In contrast ΔFi is the input to the model and is used to make
            predictions. Hence in principle it is known exactly. All of the error must be in
            F0.

            Another reason why the error must be in F0 is to consider what would happen if you ran a climate model with zero additional forcing (i.e. constant CO2 levels).
            In that case ΔFi = 0 exactly, but there would still be an uncertainty in the long wave cloud feedback. The only way that uncertainty could enter Dr. Frank’s model is through the parameter F0.

          • The uncertainty range, however large it is, does not say that the answer is wrong. It could be exactly correct to the precision calculated but there is no way to know that it is except to wait out the time and find out what happens. The calculated answer could be very correct or wildly wrong. With such large uncertainty there is no way to even make a reasonable guess which.

          • Izaak Walton, “A question for you — why is it possible to know F0 with absolute precision and ΔFi only to within an accuracy of +/- 4 W/m^2?”

            The ΔFi are the SRES and RCP forcings from the IPCC, Izaak. They are givens, and have no uncertainty. That point is clearly made in the paper.

            The (+/-)4 W/m^2 uncertainty in simulated tropospheric thermal energy flux comes from the models. Let me repeat that: it comes from the models.

            It means the models cannot simulate the tropospheric thermal flux to better than (+/-)4 W/m^2 resolution, all the while they are trying to resolve a 0.035 W/m^2 perturbation.

            The perturbation is (+/-)114 times smaller than they are able to resolve. The models are a jeweler’s loupe, and people claim to see atoms using it.

        • Izaak Walton, “So why would the CMIP5 calibration error statistic directly enter your equation exactly as does a change in forcing due to rising CO2?”

          Because it is an uncertainty in simulated tropospheric thermal energy flux, Izaak. This point is discussed in the paper.

          GHG forcing enters the tropospheric thermal energy flux, and becomes part of it. However, the resolution of the model simulation for this flux is not better than (+/-)4 W/m^2.

          This means the uncertainty in simulated forcing is (+/-)114 times larger than the perturbation in the forcing.

          This is why the (+/-)4 W/m^2 enters into eqn. 1 at eqn. 5. It includes the impact of simulation uncertainty on the predicted effect. It means the effects due to GHG forcing cannot be simulated to better than (+/-)4 W/m^2.

          It is impossible to resolve the effect of a perturbation that is 114 times smaller than the lower limit of resolution.
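
          As a sketch of that resolution comparison, using only the two figures quoted here (this is arithmetic for illustration, not a re-derivation of anything in the paper):

          ```python
          # Arithmetic only, with the two figures quoted above.
          u_lwcf = 4.0           # W/m^2, annual LWCF calibration uncertainty
          perturbation = 0.035   # W/m^2, annual increase in GHG forcing

          print(f"uncertainty / perturbation = ±{u_lwcf / perturbation:.0f}")  # about ±114
          ```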

          • Dr. Frank,
            You stated in the comment above that the Delta Fi are givens and have no
            uncertainties. In that case the errors in the models must enter your emulation
            equation through the parameter F0. Is this not correct?

          • No, that is not correct, Izaak. The F_0 is also a given. It has no error.

            From page 8 in the paper: “Lauer and Hamilton (2013) have quantified CMIP3 and CMIP5 TCF model calibration error in terms of cloud forcings.

            That is from where the (+/-) uncertainty comes.

            It is not in F_i, it is not in F_0. It is in the simulation.

  59. The GCM’s are a hoax.

    Dr. Frank correctly demonstrates that the aggregation of error vastly exceeds the relatively minor temperature change predictions in GCMs, but this alone is not damning enough for diehard adherents.

    Others cite the well-known inaccuracy of the predictions, which also ought to be proof of invalidity. However, it is also true (though rarely stated) that GCM inaccuracies are one-sided: they all overestimate predicted temperatures. They all shoot high. There are no undershooting GCM’s. They miss, but always in one direction.

    The black boxes are rigged, obviously. All the palaver about physical laws and balances is snake oil. If the GCMs were fair dice, some would guess too low and some too high. But they don’t. The dice are loaded.

    Defenders of GCMs are dishonest. This is a logical conclusion. Defenders may be scientifically literate in the particulars, but to ignore the overwhelming one-sided bias in the outputs is not scientific. It is intentional blindness. Intentional, as in they ignore the obvious on purpose — which is unforgivable mendacity in support of a hoax.

    • I tend to agree that it’s “unforgivable mendacity”, but I’m prepared to allow for simple self-delusion, combined with a certain “I’m never wrong” arrogance (CC advocacy does seem to attract a certain character type!)

      Or as Nick might say, with plaintive desperation: “Maybe the Earth’s climate is running cool.”

    • ren – please consider putting together a guest post for WUWT setting out your knowledge and views from start to finish.

      Your in-line comments and links suggest you have an informed handle on the here and now workings of the global weather system, but, appearing as they do as random arrivals, it is hard to piece together what your “future climate big picture” is.

      Thanks, and apologies if you have done this already but I’ve missed it. If so, please post a link.

  60. Miskolczi showed how to model radiative flux in the atmosphere.
    His modelling was closely constrained by experimental measurements, from radiosonde balloons.
    Not the theory-only models refuted here by Pat Frank.

    What has happened to radiosonde balloons? Do they exist anymore? Does anyone use them anymore?
    Or is everyone terrified that if they use radiosonde data they will make themselves look like Ferenc Miskolczi?

    To quote the great man:

    It seems that the Earth’s atmosphere maintains the balance between the absorbed short wave and emitted long wave radiation by keeping the total flux optical depth close to the theoretical equilibrium values.

    On local scale the regulatory role of the water vapor is apparent. On global scale, however, there can not be any direct water vapor feedback mechanism, working against the total energy balance requirement of the system. Runaway greenhouse theories contradict to the energy balance equations and therefore, can not work.

    An other important consequence of the new equations is the significantly reduced greenhouse effect sensitivity to optical depth perturbations. Considering the magnitude of the observed global average surface temperature rise and the consequences of the new greenhouse equations, the increased atmospheric greenhouse gas concentrations must not be the reason of global warming. The greenhouse effect is tied to the energy conservation principle through the [not copy-able] equations and can not be changed without increasing the energy input to the system.

    To stop Miskolczi publishing his findings as a NASA scientist (as he then was), a colleague logged into Miskolczi’s account without his knowledge, went to the journal site (archiv) and withdrew the submitted paper. That’s the kind of people climate scientists are. He was immediately dismissed from NASA, and his work was so violently marginalised by Mafia-like activity by NASA and the warmist crowd that he published it in a journal of his native country, Hungary:

    file:///C:/Users/PhilS/Documents/HOME/climate/miskolczi%20in%20hungarian%20journal.pdf

    https://www.friendsofscience.org/assets/documents/The_Saturated_Greenhouse_Effect.htm

    https://friendsofscience.org/assets/documents/E&E_21_4_2010_08-miskolczi.pdf

    Miskolczi’s is the only serious atmospheric modeling of radiative flux that exists to date.

  61. The BIGGEST problem with all climate models is that the caveats are NOT PRINTED LARGE ENOUGH for the politicians (including all the numbskulls in the UN) to notice or understand adequately.

    The current batch of climate models are only first approximations of how the climate evolves; they are not particularly accurate, complete in assessing this planet’s many climatic factors, or particularly trustworthy in their output. They are a research tool, nothing more.
    Climate Models are WORK IN PROGRESS, and may ultimately be proved to be entirely erroneous, or of very low utility.

    IMO every other page of the UN-IPCC reports should have the words —
    “CLIMATE MODEL RESULTS ARE ONLY APPROXIMATE!”

  62. Pat Frank
    “Izaak Walton, “Leaving aside the errors that Nick and Dr. Spencer have pointed out…”
    Another guy who thinks a calibration error statistic is an energy flux. There’s a good critical start.”

    A key to understanding (and hence being able to intelligently critique) Pat Frank’s analysis starts with the above comment. “Error statistic” versus “energy flux”. The former represents uncertainty, the latter implies error. Understanding these two terms in the context of PF’s analysis is paramount. Failure to agree on a common understanding of these terms results in discussion and critique that has no solid base. It’s rather like some of the discussions involving radiation energy transfer…

    There are some posters on WUWT who are truly gifted in presenting complex concepts in ways that many different people can digest; I ask them to continue to present explanations, examples, and discussion of this core subject. This is where real progress in meaningful discussion of AGW can be made.

    • It’s worse than that, Ethan. Those folks think the calibration error statistic itself represents an energy flux.

      Not the error in an energy flux, but an energy flux itself. As in something that perturbs the modeled climate.

      They have less than no clue.

      • Pat, thanks. Hence my comments to work on the root problem many are having with the definition or concept of outcome uncertainty vs. data error, or as you adroitly state, treating outcome uncertainty as an actual data variation within the process creating the outcome. Agreed that this is a glaring problem in understanding your analysis. Unless progress is made in a common understanding of these terms, no progress can be made in evaluating the actual merits of your analysis.

  63. To get some clarity, couldn’t you try to prove the uncertainty and just run the model a thousand times with random cloud fraction averages for each year within a plausible parameter range and see how much variation you get in the total warming? Or to put it another way, can anyone clearly explain why this debate is so redundantly stuck in the mud?

    • can anyone clearly explain why this debate is so redundantly stuck in the mud?

      Dana, I don’t know, but I can guess. The error in predicting (estimating) low-level cloud cover and high level cloud cover is the stake in the heart of the man-made dangerous climate change vampire. It is the one thing that completely destroys the IPCC’s estimates of human influence on climate change and they can do nothing about it. Cloud cover error swamps the measurement of human influence, and cloud cover defies all attempts to estimate it; it happens on too small a spatial scale. No papers have been attacked more ruthlessly (and with so little justification) than Richard Lindzen’s (Lindzen and Choi, 2011) on this very subject, and I predict the same with Pat Frank’s, probably with less justification. Nick Stokes’ and Roy Spencer’s criticisms are hairsplitting and make no difference in my opinion, for what that is worth.

      • Ever try to get an elephant out of the mud it was stuck in? (^_^)

        Something that big and set is really hard to move. The “stuckness” is NOT because of those trying to free the poor beast, but because of the sheer weight of the beast who cannot free itself because of its own weight. And, in case you can’t see the metaphor, the massive weight here is on the side of the critics of Pat’s analysis.

        Congratulations on not being an elephant.

    • That would not give you the uncertainty. Each run would have the same uncertainty. That is the whole point. The output is uncertain regardless of what you do with the run. Think of it this way: how certain are you that the program gives you the correct value when you run it? (This doesn’t include what you think is correct output.) In other words, how reliable do you think the output is when considering all the inputs and relationships in the program?

    • dana_casun: To get some clarity, couldn’t you try to prove the uncertainty and just run the model a thousand times with random cloud fraction averages for each year within a plausible parameter range and see how much variation you get in the total warming?

      That is part of the idea of bootstrapping (though sampling would not be done exactly as you describe). The problem is that the GCMs take a long time to run on current computing equipment. No one, apparently, wants to contemplate actually waiting for the results of a thousand runs.
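
      For what it is worth, a toy version of the proposed experiment can be run against the emulation equation, which is cheap, rather than against a GCM, which is not. The constants below (f_CO2 = 0.42, 33 K, F_0 = 30 W/m^2, 0.035 W/m^2 per year) and the treatment of the ±4 W/m^2 as a random yearly wobble are assumptions made only for illustration; as the replies above stress, the spread among such runs is not the same thing as the propagated calibration uncertainty.

      ```python
      # Toy surrogate experiment; every constant here is an assumption for illustration.
      import random, statistics

      random.seed(0)
      f_co2, k_gh, f0 = 0.42, 33.0, 30.0   # emulator constants (f0 in W/m^2)
      annual_dF = 0.035                    # W/m^2 per year, assumed GHG forcing increment
      cloud_wobble = 4.0                   # W/m^2, treated here as a random yearly perturbation
      n_years, n_runs = 100, 1000

      finals = []
      for _ in range(n_runs):
          total_forcing = sum(annual_dF + random.gauss(0.0, cloud_wobble) for _ in range(n_years))
          finals.append(f_co2 * k_gh * total_forcing / f0)   # emulated 100-year anomaly

      print(f"mean anomaly {statistics.mean(finals):+.1f} K, run-to-run spread {statistics.stdev(finals):.1f} K")
      ```

      The run-to-run spread grows as the square root of the number of years because independent yearly perturbations add in quadrature; whether that kind of spread measures the same thing as a propagated calibration uncertainty is precisely what is being argued in the replies above.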

  64. I provide the following in hopes it will help to foster common ground with the use of terminology.

    Term definitions, NO context (ie common usage, google search result):

    Uncertainty: “the state of being uncertain”
    Uncertain :”not able to be relied on; not known or definite”
    Error: “a mistake”

    Term discussions, from The NIST Reference on Constants, Units, and Uncertainty (https://physics.nist.gov/cuu/Uncertainty/glossary.html), see
    “NIST Technical Note 1297
    1994 Edition
    Guidelines for Evaluating and Expressing
    the Uncertainty of NIST Measurement Results”

    https://emtoolbox.nist.gov/Publications/NISTTechnicalNote1297s.pdf

    Uncertainty :”parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand”

    The following quotes are from NIST Technical Note 1297 (link above).

    “NOTE – The difference between error and uncertainty should always
    be borne in mind. For example, the result of a measurement after
    correction (see subsection 5.2) can unknowably be very close to the
    unknown value of the measurand, and thus have negligible error, even
    though it may have a large uncertainty (see the Guide [2]).” (Page 7)

    “NOTES
    1 The uncertainty of a correction applied to a measurement result to
    compensate for a systematic effect is not the systematic error in the
    measurement result due to the effect. Rather, it is a measure of the
    uncertainty of the result due to incomplete knowledge of the required
    value of the correction. The terms “error” and “uncertainty” should not
    be confused (see also the note of subsection 2.3).” (Page 9)

    “2 In general, the error of measurement is unknown because
    the value of the measurand is unknown. However, the
    uncertainty of the result of a measurement may be
    evaluated.” (Page 20)

    “3 As also pointed out in the Guide, if a device (taken to
    include measurement standards, reference materials, etc.) is
    tested through a comparison with a known reference
    standard and the uncertainties associated with the standard
    and the comparison procedure can be assumed to be
    negligible relative to the required uncertainty of the test, the
    comparison may be viewed as determining the error of the
    device.” (Page 20)

    “In fact, we recommend that the terms “random uncertainty”
    and “systematic uncertainty” be avoided because the
    adjectives “random” and “systematic,” while appropriate
    modifiers for the word “error,” are not appropriate modifiers
    for the word “uncertainty” (one can hardly imagine an
    uncertainty component that varies randomly or that is
    systematic).” (Page 21)

    • Thanks Ethan, very helpful. Especially this:

      In fact, we recommend that the terms “random uncertainty” and “systematic uncertainty” be avoided because the adjectives “random” and “systematic,” while appropriate modifiers for the word “error,” are not appropriate modifiers for the word “uncertainty”

      • Andy May and Ethan Brand: In fact, we recommend that the terms “random uncertainty” and “systematic uncertainty” be avoided because the adjectives “random” and “systematic,” while appropriate modifiers for the word “error,” are not appropriate modifiers for the word “uncertainty”

        I was going to elaborate a little on this: since its inception, mathematical probability has been used as a model for two sorts of things: (a) confidence (or uncertainty) in a proposition or outcome (called epistemic or personal probability) and (b) relative frequency of event outcomes subject to random variability (called “aleatory” probability). Not everyone agrees that the two uses have been equally successful in applications. While it is obvious that random variability in the phenomena (hence data collected from studies) produces uncertainty in any parameter estimates (or other summaries of what might be called “noumena” or “empirical knowledge”), it is not obvious how to translate the random variation in the data to uncertainty in the outcome of research.

        One way, illustrated by Pat Frank, is to take the estimate of standard deviation of the random variation in the parameter estimate as a representation of the uncertainty in the point estimate, treat that uncertainty via a distribution of possible outcomes that have the same standard deviation, and propagate that standard deviation through subsequent calculations, as the standard deviation itself has been derived by propagating the standard deviation of the random part of the observations through the calculations that produce the parameter estimate. That provides an estimate of the random variation in the forecast errors induced by the random variation in the observations.

        Note how loaded some of the terms are: “random” variation is variation that is not reproducible or predictable, but you can get sidetracked into the question of whether it is in some sense “truly random” (which Einstein objected to by mocking the idea that God throws dice), or “merely empirically random” because there are gaps in the knowledge of the causal mechanism producing the outcome. Operationally they can’t be distinguished because the empirical random variation (the most thoroughly replicated research outcome in all of science) makes measurement of any hypothetical true random variation subject to great uncertainty.

        “Estimate”, “estimation” and “approximation” dominate the discussion. Only an estimate of the “true” value of the parameter is available from data. Only an estimate of the random variation in the data is available, not its true distribution. All the mathematical equations are approximations to the true relations that we might be hoping to know. And the result of an uncertainty propagation of the sort carried out by Pat Frank is an approximation to the random variation in the forecasts induced by the random variation in the data. One can elaborate this endlessly, and find many books on the topic in the philosophy of science and statistics sections of a university library.

        One is left with the usual questions of pragmatism in applied math: are the approximations accurate enough? How can you tell whether they are accurate enough? Have they been shown to be accurate enough? Accurate enough for some purposes but not for others?

        And so forth.

        Getting back to the specific case. Is it true as claimed by Nick Stokes that Pat Frank’s procedure implements a random walk? No. Conditional on a parameter inserted into the model, and the increases in CO2 year by year, the sequence of yearly point estimates would be deterministic. Is it true that you cannot propagate uncertainty without propagating particular errors, as claimed by Nick Stokes? No. If you are willing to treat the uncertainty in the parameter estimate as resulting from the random variation in the data, you can propagate the uncertainty in forecasts as the standard deviation of the forecast random variation. Is the estimate of the standard deviation of the parameter estimate good enough for this propagation? Well, people seem willing to accept the estimate of the parameter, and the empirical estimate of its uncertainty shows that it is indistinguishable from 0. That information is too important to ignore, but ignoring it has been the standard up till now. Is there a better procedure for propagating the uncertainty in the parameter estimate? The only alternative proffered in this series of interchanges is bootstrapping, which will not be possible until available computers are about 1000 times as fast as the supercomputers available now.

        Is there a better estimate of the uncertainty in the GCM forecasts than what Pat Frank has calculated? Not that anyone has nominated. Is there any reason to think that Pat Frank’s calculation has produced a gross overestimate of the uncertainty? Not that anyone has mentioned so far.

        • Is it true as claimed by Nick Stokes that Pat Frank’s procedure implements a random walk? No. Conditional on a parameter inserted into the model, and the increases in CO2 year by year, the sequence of yearly point estimates would be deterministic. Is it true that you cannot propagate uncertainty without propagating particular errors, as claimed by Nick Stokes? No. If you are willing to treat the uncertainty in the parameter estimate as resulting from the random variation in the data, you can propagate the uncertainty in forecasts as the standard deviation of the forecast random variation

          While yearly point estimates of a variable are produced deterministically by GCMs, the estimates themselves need not be at all predictable. The chaotic nature of numerical solutions for N-S equations virtually guarantees an apparent randomness, which needs to be properly characterized.

          Unless one is willing to fly on the wings of naked probabilistic premise, this necessarily involves such physical constraints as conservation of energy, inertia, etc. Thus, there are definite limits to how fast and far physically-anchored uncertainty can grow. And if one is interested not in specific yearly predictions but only in some very-low-frequency features often called “trends,” then the uncertainty can be narrowed even further. (In practice, as time advances beyond the effective prediction horizon set by autocorrelation, physically-blind Wiener filters devolve to simply projecting the mean value of the random signal.)

          If you want to do realistic science, you CANNOT recursively “propagate the uncertainty in forecasts as the standard deviation of the forecast random variations.” Mistaking the practical nature of the problem as one of point-prediction of the accumulation of essentially uncorrelated random increments indeed leads to an ever-expanding random walk.

          • sky,

            “Mistaking the practical nature of the problem as one of point-prediction of the accumulation of essentially uncorrelated random increments indeed leads to an ever-expanding random walk.”

            Uncertainty is not random and therefore cannot lead to a random walk. You are confusing error with uncertainty. Even though the true answer is contained in the uncertainty interval, the uncertainty makes it impossible to *know* what the true answer is. Someone earlier used the example of a crowd size estimate for an event. If you tell someone that you are 99% sure it will be between 400 and 600, then exactly what will the crowd size turn out to be? Even if it is a ticketed event and all the tickets are sold, there will be an uncertainty interval because of people who simply won’t be able to make it to the event.

          • Nowhere do I claim that uncertainty itself is random. What I’m trying to convey is that recursive projection of a confidence interval would be correct if one were trying to actually predict a random walk. Otherwise, it should remain fixed, or in the case of nonstationary processes, change along with changes in signal variance.

          • 1sky1: If you want to do realistic science, you CANNOT recursively “propagate the uncertainty in forecasts as the standard deviation of the forecast random variations.” Mistaking the practical nature of the problem as one of point-prediction of the accumulation of essentially uncorrelated random increments indeed leads to an ever-expanding random walk.

            I think that post is incidental to what I wrote. I did not, for example, describe “recursively” propagating the uncertainty, and I explicitly wrote that Pat Frank’s procedure does not create a random walk.

            We are getting a little afield, but imagine choosing randomly between 2 6-sided dice; one has the usual markings, a dot on one side, …, up to 6 dots on the last side; the other has 1 dot on each side. After choosing, you throw the die 20 times; for the first die the outcomes are random variables; for the second die, the outcomes are a non-random sequence of 1s. If the outcomes are added to a running sum, the first case produces a random walk; the second does not produce a random walk. That illustrates the possibility that, conditional on the random outcome, the subsequent trajectory can be non-random.

            If one conceptualizes the “choice” of cloud feedback parameter as a random outcome from a distribution with a large variance, hence an uncertain estimate of the true parameter, the subsequent series of yearly forecasts (given a sequence of CO2 values) is non-random, not a random walk. The uncertainty in the final year prediction results from the uncertainty of the outcome of estimating the cloud feedback parameter, not on the random variation of the cloud feedback parameter from year to year.

            However one may object to representing uncertainty as a probability distribution of possible parameter values, states of “knowledge”, “bets”, mentation, or “choices”, Pat Frank’s procedure does not produce a random walk. Whatever the unknown error in the parameter estimate is, that error, and hence uncertainty over its value, propagates deterministically. If at some later date we were to learn exactly what the error had been all along, we could adjust the annual forecasts accordingly, as Nick Stokes describes for the meter stick analogy. Meanwhile we are uncertain as to whether the error was small, medium, or large relative to our purposes, and the uncertainty propagates as described by Pat Frank.
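
            A literal rendering of the two-dice thought experiment, for anyone who wants to run it (the dice and the seed are of course arbitrary):

            ```python
            # Die A is a normal die; die B has one dot on every face.
            import random

            random.seed(42)

            def running_sum(throws):
                total, path = 0, []
                for t in throws:
                    total += t
                    path.append(total)
                return path

            path_a = running_sum(random.randint(1, 6) for _ in range(20))   # random walk
            path_b = running_sum([1] * 20)                                  # deterministic: 1, 2, ..., 20

            print("die A:", path_a)
            print("die B:", path_b)
            # Which die was drawn is the random outcome; conditional on drawing die B
            # the whole trajectory is deterministic. The uncertainty lives in the draw,
            # not in a year-to-year random step.
            ```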

          • Tim Gorman and Matthew Marler are exactly right.

            Over and over, we see critics seeing predictive uncertainty as physical error. It’s not.

            Exactly this mistake powers 1sky1’s argument about random walks, as both Tim and Matthew point out.

            I’ve put together a set of extracts from the literature on the meaning and calculation of predictive uncertainty. I’ll post the whole thing below.

            But here, because it’s particularly relevant, I’ll quote an extract from Kline, 1985:

            Kline SJ. The Purposes of Uncertainty Analysis Journal of Fluids Engineering. (1985) 107(2), 153-160.

            The Concept of Uncertainty

            An uncertainty is not the same as an error. An error in measurement is the difference between the true value and the recorded value; an error is a fixed number and cannot be a statistical variable. An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable. (my bold)”

          • I did not, for example, describe “recursively” propagating the uncertainty, and I explicitly wrote that Pat Frank’s procedure does not create a random walk.

            While the word “recursively” was not explicitly used, it’s implicit in Pat Frank’s uncertainty propagation scheme, to wit:

            Following from equation 4, the uncertainty “u” in a sum is just the root-sum-square of the uncertainties in the variables summed together…

            Such step-wise increase is precisely how uncertainty of an N-step random walk behaves. As I already explained to Tim Gorman, I did not intend to suggest that the uncertainty itself produces a random walk. Instead of “leads to,” I should have written “points to.”

            The elementary demonstration of the difference between random and deterministic outcomes of die-tossing may be instructive to laymen. It scarcely reveals anything, however, about the specific effect of “CFE error” in actual model forecasts. This effect is unlikely to be deterministic in an “ensemble” average, because GCMs don’t uniformly rely upon any constant “cloud feedback parameter,” but allow the effect to vary differently from year to year in every model. Moreover, some models may treat the parameterized cloud effect not as “feedback,” but as a “forcing.”

            While Frank treats this effect as a systematic “calibration error,” in analogy to a static laboratory test, the actual dynamic effect upon the forecast time series is quite different. Random error need not be white noise, as he assumes; it may be highly autocorrelated at small lags.

            I’ll amplify on these points as I find time to spare.
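
            As a small illustration of why the two keep being conflated: the root-sum-square of N equal per-step uncertainties and the standard deviation of an N-step random walk with independent steps of the same sigma are numerically identical, even though — as the reply below insists — one is an uncertainty statistic and the other a spread of physical trajectories. The numbers here are purely illustrative.

            ```python
            # Same sigma per step, N steps: RSS of uncertainties vs. spread of random-walk endpoints.
            import math, random, statistics

            random.seed(7)
            s, n_steps, n_walks = 4.0, 100, 5000

            rss = s * math.sqrt(n_steps)   # root-sum-square of N equal per-step uncertainties = 40.0

            endpoints = [sum(random.gauss(0.0, s) for _ in range(n_steps)) for _ in range(n_walks)]
            print(f"root-sum-square:            {rss:.1f}")
            print(f"random-walk endpoint sigma: {statistics.stdev(endpoints):.1f}")  # ~40, within sampling noise
            ```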

          • 1sky, “Such step-wise increase is precisely how uncertainty of an N-step random walk behaves.

            An uncertainty analysis is not about error. Whether the math is the same, or not, is irrelevant.

            As I already explained to Tim Gorman, I did not intend to suggest that the uncertainty itself produces a random walk. Instead of “leads to,” I should have written “points to.”

            A distinction without a difference. An uncertainty analysis does not imply error trajectories.

            This effect is unlikely to be deterministic in an “ensemble” average, because GCMs don’t uniformly rely upon any constant “cloud feedback parameter,” but allow the effect to vary differently from year to year in every model.

            Yes, and the average uncertainty interval every year in LW cloud feedback is (+/-)4 W/m^2.

            While Frank treats this effect as a systematic “calibration error,” in analogy to a static laboratory test, the actual dynamic effect upon the forecast time series is quite different.

            I treat it as a model calibration error statistic, which is what it is.

            Random error need not be white noise, as he assumes; it may be highly autocorrelated at small lags.

            I assume nothing about error.

            Over and yet over again, the same mistake.

            The paper is about uncertainty intervals, and you go on about error. Stop already.

          • “I assume nothing about error.

            Over and yet over again, the same mistake.

            The paper is about uncertainty intervals, and you go on about error. Stop already.”

            If it were a mule it would be time to get out the 2×4!

          • Compare:

            An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable.

            with

            An uncertainty analysis is not about error. Whether the math is the same, or not, is irrelevant… An uncertainty analysis does not imply error trajectories.

            Logically, you can’t have it both ways. Repeating the ambiguous mantra that uncertainty is not error simply doesn’t cut it. A statistical specification of the error-range clearly is not the same as any specific error. But unless that range specification is about the ensemble of errors (and about their time-series trajectories), it’s a meaningless conceit.

            I hope to find time this week to address the multiple mischaracterizations here of the actual significance of the empirically estimated (+/-)4 W/m^2.

          • 1sky1, “Logically, you can’t have it both ways. Repeating the ambiguous mantra that uncertainty is not error simply doesn’t cut it.

            Here’s what bothers you: “An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable. from Kline 1985.

            And my, “An uncertainty analysis is not about error. Whether the math is the same, or not, is irrelevant… An uncertainty analysis does not imply error trajectories.

            Where’s the conflict?

            Error might take on various values. In a futures projection one cannot know what those values might be.

            If we can’t know the error values, we can’t know the error trajectory.

            If one doesn’t know the error trajectory, one doesn’t know the physically correct trajectory, either.

            Whatever the uncertainty interval, it is not the error.

            That uncertainty statistic in the case of my paper is the annual rmse (+/-)4 W/m^2 LWCF GCM calibration error statistic. Simulation error is somewhere within that interval, but one cannot know where.

            Propagation of uncertainty is the only way to determine the reliability of an air temperature futures projection.

          • Where’s the conflict?

            The conflict lies in the quaintly inconsistent misuse of mathematical terminology that is endemic here. Error in modeling physical time series is the discrepancy between the modeled and the observed values as a function of time. Uncertainty is the specification of the statistical properties (e.g., mean and variance) of the time series of error. That statistical specification, whether for the sample at hand, or for the presumed ensemble of error-series, is a fixed parameter that does NOT vary randomly.

      • Matthew Marler, I’m quite sure you understand the whole business better than I do. 🙂

        I’m enjoying the deeper insights of your posts.

        • Thank you for the compliment. I am mindful of the fact that you did the actual work here. Good for you!

    • As I said in an earlier comment, Soden and Held’s postulated water vapor feedback mechanism is central to the theory that additional warming at the earth’s surface caused by adding CO2 to the atmosphere can be amplified, over time, from the +1 to +1.5C direct effect of CO2 into as much as +6C of total warming.

      Because it is impossible at the current state of science to directly observe this postulated mechanism operating in real time inside of the earth’s climate system, the mechanism’s presence must be inferred from other kinds of observations. One important source of the ‘observations’ used to characterize and quantify the Soden-Held feedback mechanism is the output generated from the IPCC’s climate models, a.k.a. the GCM’s.

      The use of GCM outputs in assessing the climate’s sensitivity to the addition of CO2 over time has the effect of transforming the GCM’s into real-time climate data collection and measurement tools which operate at discrete points in the future.

      Because the outputs of the GCM’s are being used as data collection and measurement tools in assessing the climate’s response to adding CO2 — albeit at discrete points in the future — and because the GCM’s are being supported by government-funded research and academic organizations, it is perfectly reasonable to expect that the GCM’s should be subject to the guidelines of NIST Technical Note 1297, 1994 Edition, Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results.

      Two questions therefore arise: 1) are the GCM’s generally in conformance with the NIST Technical Note 1297 guidelines; and 2) could a version of Pat Frank’s paper be written which directly uses the NIST guidelines as a basis for assessing the credibility and the applicability of current GCM’s as data collection and measurement tools in the prediction of future climate states?

      • “Soden and Held’s postulated water vapor feedback mechanism is central to the theory that additional warming at the earth’s surface”
        They didn’t invent it. Arrhenius 1896 had about the same degree of enhancement due to water vapor.

  65. Miscolczi’s work just doesn’t seem to go away.

    https://rclutz.wordpress.com/2017/05/17/the-curious-case-of-dr-miskolczi/

    All the troughers with one voice vehemently trashed the research of Ferenc Miskolczi, which said there is no CO2 greenhouse effect, threatening the wonderfully lucrative climate scam. Even lukewarm voices such as Judith Curry joined in rejecting Miskolczi.

    Note that all the rebuttal narrative against the Hungarian scientist never once quotes his actual words. Even Andrew Lacis, who has built a career on vicious and voluminous ad hominem attacks on Miskolczi, never once quotes Miskolczi’s own words. It’s all indirect handwaving like “M doesn’t understand this law, M incorrectly applies this equation…” Expertly imprecise and impossible to pin down. You can sense the racist loathing as he writes.

    Miskolczi’s story reads like a book. Looking at a series of differential equations for the greenhouse effect, he noticed the solution — originally done in 1922 by Arthur Milne, but still used by climate researchers today — ignored boundary conditions by assuming an “infinitely thick” atmosphere. Similar assumptions are common when solving differential equations; they simplify the calculations and often yield a result that still very closely matches reality. But not always.

    So Miskolczi re-derived the solution, this time using the proper boundary conditions for an atmosphere that is not infinite. His result included a new term, which acts as a negative feedback to counter the positive forcing. At low levels, the new term means a small difference … but as greenhouse gases rise, the negative feedback predominates, forcing values back down.

    When Miskolczi later informed his group at NASA that he had more important results, they finally understood the whole story and tried to withhold Miskolczi’s further material from publication. His boss, for example, sat at Ferenc’s computer, logged in with Ferenc’s password, and canceled a recently submitted paper from a high-reputation journal as if Ferenc had withdrawn it himself. That was the reason Ferenc finally resigned from his (US$90,000/year) job.

  66. And as more data comes in, it supports Miskolczi:

    https://www.kiwithinker.com/2014/10/an-empirical-look-at-recent-trends-in-the-greenhouse-effect/

    https://www.kiwithinker.com/wp-content/uploads/2014/10/OLWIR-Temp-and-SB.jpg

    Yes – outgoing longwave is increasing with atmospheric temperature in accordance with Miskolczi’s predictions.

    CERES shows no net change in atmospheric radiation budget – check out slide 5.

    http://www.mpimet.mpg.de/fileadmin/atmosphaere/WCRP_Grand_Challenge_Workshop/presentations/GC_Loeb.pdf

    The Miskolczi story is not over.

  67. Pat, you will probably recall that back on your first thread https://wattsupwiththat.com/2019/09/07/propagation-of-error-and-the-reliability-of-global-air-temperature-projections-mark-ii/ I disputed the solidity of your error propagation and asked you if your theory was falsifiable, but then I went away on holiday so missed some fun and games since then. I should now like to return to this point.

    You wrote in reply “the only way to falsify a physical error analysis that indicates wide model uncertainty bounds, is to show high accuracy for the models”. I am afraid that that remark addresses one input to your equations, namely the model accuracy (or at least variance if it is a biassed model, since low variance can still be associated with high error in that case). But it does not address the equations themselves and the effect of assumptions about covariance matrices.

    Recall that I wrote an equation

    T_i(t) = T_i(t-1) + d_i(t) + e_i(t)

    but since you have explained that it is forcing (which I’ll call F) which is iterated we should instead write

    F_i(t) = F_i(t-1) + d_i(t) + e_i(t)

    and then T_i is derived from that. I gave examples of how d_i(t), a non-stochastic component, and e_i(t), a stochastic component, could combine to give the final F_i(t) a low uncertainty. In your reply to me on the meaning of “uncertainty”, you wrote:

    “The model then is known to predict an observable as a mean plus or minus an interval of uncertainty about that mean revealed by the now-known calibration accuracy width.”

    The observable here is F_i(t), so the uncertainty is a statistical confidence interval for its value (with a normal distribution that would be something like mean +/- 2 standard deviations). So the question is how can we falsify your prediction of the uncertainty interval, in a scientific manner? I suggest to you the following.

    You have a known calibration accuracy width, which is a standard deviation s (if there are multiple parameters then s will be the square root of a covariance matrix). Then after n time steps the model is known to predict a value m +/- p(s,n) where p is the function which projects the uncertainty. I believe you are using p(s,n) = s sqrt(n). The fact that for each model multiple calibration experiments were possible in order to estimate s means that multiple predictive experiments can be done to test whether p(s,n) is correct. Specifically:

    Choose n, say 4 (years).
    Choose a replication number m, say 100.
    Run the model m times (you may need to ask the model owner to do this on your behalf), recording x_1,…,x_m. Let

    a = sum_i x_i/m, v = sum_i (x_i – a)^2/(m p(s,n)^2).

    Test the value of v using statistics.

    m·v should be distributed as chi-squared with m-1 degrees of freedom, so the expected value of v is (m-1)/m, effectively 1.
    If m·v is outside the 95% confidence interval for that chi-squared, then your model will have been falsified.
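
    Purely to illustrate the mechanics of that test, here is a minimal Python sketch with simulated numbers standing in for real GCM outputs; the values of s, n and m, the normality assumption, and the simulated runs themselves are assumptions for the example, not anything taken from an actual model.

    ```python
    # Sketch of the proposed falsification test, with simulated stand-ins for
    # the m recorded model outputs (all numbers here are illustrative only).
    import numpy as np
    from scipy.stats import chi2

    s = 4.0                  # assumed calibration standard deviation (W/m^2)
    n = 4                    # number of annual time steps
    m = 100                  # number of replicate runs
    p = s * np.sqrt(n)       # the propagated uncertainty p(s, n) = s*sqrt(n)

    rng = np.random.default_rng(0)
    x = rng.normal(loc=0.0, scale=p, size=m)   # stand-ins for x_1, ..., x_m

    a = x.mean()
    v = np.sum((x - a) ** 2) / (m * p ** 2)    # the statistic defined above

    # Under the hypothesis being tested, m*v follows chi-squared with m-1 dof.
    lo, hi = chi2.ppf([0.025, 0.975], df=m - 1)
    print(f"v = {v:.3f}, m*v = {m * v:.1f}, 95% interval for m*v: [{lo:.1f}, {hi:.1f}]")
    # If m*v falls outside [lo, hi], the assumed p(s, n) is rejected at the 5% level.
    ```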

    I suggested a fairly small n there, 4, and it is possible that your theory will just survive that test. But try n=25 for a keener test.

    Do you see anything wrong with that proposal for an experiment?

    Rich.

    • You are missing the point. “Calibration accuracy” is not a statistical calculation with a standard deviation. There is no population-driven data set determining a mean and standard deviation. It is merely an interval where any value within it has equal probability of occurring. That is why it is “uncertain”. That is why it propagates rather than converging to a mean.

      • Jim,

        I am afraid that your comment contradicts Pat’s words which I quoted:

        “The model then is known to predict an observable as a mean plus or minus an interval of uncertainty about that mean revealed by the now-known calibration accuracy width.”

        So I’ll await Pat’s reply.

        • I’m afraid your quote leaves out a bit of context, Rich.

          Here’s what I wrote, with the missing sentence added back:

          The model then is known to predict an observable as a mean plus or minus an interval of uncertainty about that mean revealed by the now-known calibration accuracy width.

          That width provides the uncertainty when the model is used to predict a future state.

          Changes things a bit from what you implied, doesn’t it.

          Jim Gorman has it exactly right. His explanation didn’t contradict my words at all.

    • Rich, if you want to assess my work, you should refer to the equation I used. Not make something up, and then compose a straw man argument.

      It should be obvious that the (delta)F_i in eqn. 1 of the paper are the IPCC’s SRES and RCP forcings. They are givens, and have no error or uncertainty.

      Your first equation was irrelevant, and so is your second. T is derived as my equation 1. Not from anything you might invent.

      You wrote, “The observable here is F_i(t)” No, it is not. There are in fact no observables. There are predictions and their uncertainties.

      You wrote, “So the question is how can we falsify your prediction of the uncertainty interval,…”

      It’s not my uncertainty interval. It comes from Lauer and Hamilton, 2013. If you want to falsify it, then show that their work is wrong.

      You wrote, “If m·v is outside the 95% confidence interval for that chi-squared, then your model will have been falsified.”

      Not correct. The (+/-)4 W/m^2 calibration error statistic represents the average of 27 models run across 20 years of hindcast. Each model will have its own calibration error, probably unique to itself. It is unlikely that any one model will display the same calibration error as the average of 27 of them.

      So, your experiment tells us nothing.

      In any case, the calibration error statistic was derived doing exactly the experiment you described: model runs, followed by comparisons against observations. It’s an empirical statistic.

      The fact that the experiment you wanted produced the calibration statistic I used, effectively falsifies your argument on your own grounds.

      You have yet to grasp the nettle of my analysis, Rich. Everything you’ve tried has been totally misconceived.

      • Pat, I shall reply below to your points, which I annotate with P:, and then annotate my own as R:.

        P: Rich, if you want to assess my work, you should refer to the equation I used. Not make something up, and then compose a straw man argument.

        R: The reason I am “making things up” is that I am trying to see if there is an error regime which can explain your observations about the models and yet have a smaller propagation error than you claim. This is a legitimate activity, but is definitely a work in progress.

        P: It should be obvious that the (delta)F_i in eqn. 1 of the paper are the IPCC’s SRES and RCP forcings. They are givens, and have no error or uncertainty.

        R: OK, in a paper which I am writing I am using F to cover all forcings, so mistakenly used that letter here, since you are using it only for CO2 forcing. The variable at issue here is the Total Cloud Forcing, which you rightly say gives rise to errors in the models. So let’s replace ‘F’ in that equation by ‘C’ to denote this cloud forcing variable. It is up to me to see if I can make anything meaningful out of that.

        P: Your first equation was irrelevant, and so is your second. T is derived as my equation 1. Not from anything you might invent.

        R: Then your T is a deterministic mean and has no information about error or uncertainty. I shall ponder on how to deal with that – to do statistics one needs error distributions, and they must be lurking somewhere.

        P: You wrote, “The observable here is F_i(t)” No, it is not. There are in fact no observables. There are predictions and their uncertainties.

        R: No comment at this time.

        P: You wrote, “So the question is how can we falsify your prediction of the uncertainty interval,…”
        It’s not my uncertainty interval. It comes from Lauer and Hamilton, 2013. If you want to falsify it, then show that their work is wrong.

        R: I currently have no qualms about the 4 W/m^2 TCF error which seems to come from their paper. I had hoped that I had made it clear that it is the propagation of the errors over time which concerns me, where you cite Bevington & Robinson (2003). My fear is that your analysis makes unfounded assumptions about error structure. But I cannot prove that yet.

        P: You wrote, “If m·v is outside the 95% confidence interval for that chi-squared, then your model will have been falsified.”
        Not correct. The (+/-)4 W/m^2 calibration error statistic represents the average of 27 models run across 20 years of hindcast. Each model will have its own calibration error, probably unique to itself. It is unlikely that any one model will display the same calibration error as the average of 27 of them.

        R: In that case I have to replace “run the model” with “run the 27 models”. That would be a lot of work. But do you accept that your theory predicts that multiple runs of those models will give ensemble means which vary widely? And therefore a test of that prediction can be devised? I believe that is the crux of the matter.

        P: So, your experiment tells us nothing.
        In any case, the calibration error statistic was derived doing exactly the experiment you described: models runs, followed by comparisons against observations. It’s an empirical statistic.
        The fact that the experiment you wanted produced the calibration statistic I used, effectively falsifies your argument on your own grounds.

        R: I don’t believe those model runs ran 20 years into the future and found uncertainty intervals of 4*sqrt(20) W/m^2 with concomitant effect on model temperatures. But you have studied them more than I, so perhaps you will correct me.

        P: You have yet to grasp the nettle of my analysis, Rich. Everything you’ve tried has been totally misconceived.

        R: Perhaps. Or maybe you have yet to grasp the nettle of my objections 🙂 The next thing I am going to think about is whether statistics from a single model, rather than the 27, can provide useful information about the error propagation, and also follow some of your links about such in your helpful comment below (Sep21 12:05pm).

        Rich.

        • Rich, you wrote, “The variable at issue here is the Total Cloud Forcing, which you rightly say gives rise to errors in the models.”

          Actually, it’s approximately the other way around. (Theory)-error in the models gives rise to incorrectly simulated total cloud fraction (not forcing).

          You wrote, “R: Then your T is a deterministic mean and has no information about error or uncertainty. I shall ponder on how to deal with that – to do statistics one needs error distributions, and they must be lurking somewhere.”

          Try reading the paper sections “CMIP5 Model Calibration Error in Global Average Annual Total Cloud Fraction (TCF)” and “A Lower Limit of Uncertainty in the Modeled Global Average Annual Thermal Energy Flux”.

          They’ll tell you where the uncertainty comes from.

          You wrote, “I currently have no qualms about the 4 W/m^2 TCF error which seems to come from their (Lauer and Hamilton, 2013) paper.”

          It’s not error, Rich. It’s uncertainty. And it’s not 4 W/m^2, it’s (+/-)4 W/m^2.

          These two distinctions are central to my analysis. Virtually all of my critics have ignored, misunderstood, or avoided them.

          You wrote, “But do you accept that your theory predicts that multiple runs of those models will give ensemble means which vary widely?”

          No, because it does no such thing. Uncertainty is about reliability, not specific outcomes or errors. Uncertainty says that even if all the models produced identical air temperature projections, the uncertainty in them would remain unchanged.

          You wrote, “I don’t believe those model runs ran 20 years into the future and found uncertainty intervals of 4*sqrt(20) W/m^2 with concomitant effect on model temperatures.”

          The test runs were 20-year hindcasts, not projections. Models are tuned to reproduce known air temperatures. Their inter-model correspondence is no surprise. It is put in by hand.

          See J. T. Kiehl (2007), Twentieth century climate model response and climate sensitivity, GRL 34(22), L22710.

          Uncertainty doesn’t specify a spread of model simulated air temperatures. It specifies whether they are reliable predictions.

          Please look at the extract from Kline, 1985 in the selections on uncertainty analysis I posted in this thread, here

          You wrote, “Or maybe you have yet to grasp the nettle of my objections”

          So far, Rich, your objections have centered on inventions.

          • Pat, I have started to study your helpful screed below, and will tend to make short piecemeal comments rather than save everything up.

            First off, you have ticked me off a few times for not using +/-, but for me whenever I say that an error or uncertainty bound is X I always mean +/-X to be understood. Will have to beg forgiveness for that.

            Now the enlightening thing about that screed is that it highlights a semantic problem between scientists/engineers and statisticians (and I am the latter). A statistician talks in terms of random variables and random variates. So the result of an experiment, a priori, is X which is a random variable, often assumed to be a normal random variable with mean m and some variance s^2. Then a posteriori, after the experiment has been performed, a value x has been observed, and this is a random variate.

            Now assume that m is actually known, which might be the case for some calibration experiments. Then a statistician would call X-m the unknown random error and call x-m the observed error. But uncertainty wallahs would apparently call them “uncertainty” and “error” respectively.

            I think this can at least explain the occasional talking at cross purposes. Will continue thinking, especially about propagation over time.

          • Rich, by “screed” do you mean the extracts from the literature? If so, that’s an unusual usage.

            Second, I regret ticking you off, but I could not have surmised what you meant when it was never specified.

            If you’ve followed the debate, here and elsewhere, Nick Stokes, ATTP, and others have argued that rmse is always a positive magnitude.

            This is always in the larger context of their claim that all GCM error is a mere offset bias that, known or unknown, always subtracts away.

            This nonsense gives them call to assert that all GCM simulation anomalies are perfectly accurate.

            The same false assertion of constant bias error is a kind of folk-belief among climate modelers. I’ve encountered it a number of times among my reviewers.

            So, I’ve learned to be very careful when someone leaves off the (+/-), because that has typically been the opening gambit of asserting that all GCM simulation errors are positive offset errors that subtract away into predictive perfection.

            You apparently did not intend that meaning. I regret not knowing that, and upsetting you.

            But around here, if you write an rmse using the positive value convention, many are likely to abuse it as a sign that you too believe that (+/-) = +, and go on to argue perfectly accurate anomalies. Please be careful. Openings get exploited.

            One element of experimental science to keep in mind is that systematic errors, especially if stemming from uncontrolled variables, or from within deficient theory in non-linear models, are unlikely to be normally distributed. And often of unpredictable distribution, which is again different from randomly distributed.

            The messiness of the real world always intrudes.

  68. For the benefit of all, I’ve put together an extensive post that provides quotes, citations, and URLs for a variety of papers — mostly from engineering journals, but I do encourage everyone to closely examine Vasquez and Whiting — that discuss error analysis, the meaning of uncertainty, uncertainty analysis, and the mathematics of uncertainty propagation.

    These papers utterly support the error analysis in “Propagation of Error and the Reliability of Global Air Temperature Projections.”

    Summarizing: Uncertainty is a measure of ignorance. It is derived from calibration experiments.

    Multiple uncertainties propagate as root sum square. Root-sum-square has positive and negative roots (+/-). Never anything else, unless one wants to consider the uncertainty absolute value.

    Uncertainty is an ignorance width. It is not an energy. It does not affect energy balance. It has no influence on TOA energy or any other magnitude in a simulation, or any part of a simulation, period.

    Uncertainty does not imply that models should vary from run to run. Nor does it imply inter-model variation. Nor does it necessitate lack of TOA balance in a climate model.

    For those who are scientists and who insist that uncertainty is an energy and influences model behavior (none of you will be engineers), or that a (+/-)uncertainty is a constant offset, I wish you a lot of good luck because you’ll not get anywhere.

    For the deep-thinking numerical modelers who think rmse = constant offset or is a correlation: you’re wrong.

    The literature follows:

    Moffat RJ. Contributions to the Theory of Single-Sample Uncertainty Analysis. Journal of Fluids Engineering. 1982;104(2):250-8.

    Uncertainty Analysis is the prediction of the uncertainty interval which should be associated with an experimental result, based on observations of the scatter in the raw data used in calculating the result.

    Real processes are affected by more variables than the experimenters wish to acknowledge. A general representation is given in equation (1), which shows a result, R, as a function of a long list of real variables. Some of these are under the direct control of the experimenter, some are under indirect control, some are observed but not controlled, and some are not even observed.

    R=R(x_1,x_2,x_3,x_4,x_5,x_6, . . . ,x_N)

    It should be apparent by now that the uncertainty in a measurement has no single value which is appropriate for all uses. The uncertainty in a measured result can take on many different values, depending on what terms are included. Each different value corresponds to a different replication level, and each would be appropriate for describing the uncertainty associated with some particular measurement sequence.

    The Basic Mathematical Forms

    The uncertainty estimates, dx_i or dx_i/x_i in this presentation, are based, not upon the present single-sample data set, but upon a previous series of observations (perhaps as many as 30 independent readings) … In a wide-ranging experiment, these uncertainties must be examined over the whole range, to guard against singular behavior at some points.

    Absolute Uncertainty

    x_i = (x_i)_avg (+/-)dx_i

    Relative Uncertainty

    x_i = (x_i)_avg (+/-)dx_i/x_i

    Uncertainty intervals throughout are calculated as (+/-)sqrt[sum over (error)^2].

    The uncertainty analysis allows the researcher to anticipate the scatter in the experiment, at different replication levels, based on present understanding of the system.

    The calculated value dR_0 represents the minimum uncertainty in R which could be obtained. If the process were entirely steady, the results of repeated trials would lie within (+/-)dR_0 of their mean …”

    Nth Order Uncertainty

    The calculated value of dR_N, the Nth order uncertainty, estimates the scatter in R which could be expected with the apparatus at hand if, for each observation, every instrument were exchanged for another unit of the same type. This estimates the effect upon R of the (unknown) calibration of each instrument, in addition to the first-order component. The Nth order calculations allow studies from one experiment to be compared with those from another ostensibly similar one, or with “true” values.

    Here, replace “instrument” with ‘climate model.’ The relevance is immediately obvious. An Nth order GCM calibration experiment averages the expected uncertainty from N models and allows comparison of the results of one model run with another in the sense that the reliability of their predictions can be evaluated against the general dR_N.

    Continuing: “The Nth order uncertainty calculation must be used wherever the absolute accuracy of the experiment is to be discussed. First order will suffice to describe scatter on repeated trials, and will help in developing an experiment, but Nth order must be invoked whenever one experiment is to be compared with another, with computation, analysis, or with the “truth.”

    Nth order uncertainty:

    * Includes instrument calibration uncertainty, as well as unsteadiness and interpolation.
    * Useful for reporting results and assessing the significance of differences between results from different experiments and between computation and experiment.

    The basic combinatorial equation is the Root-Sum-Square:

    dR = sqrt[sum over ((dR/dx_i)*dx_i)^2]

    https://doi.org/10.1115/1.3241818

    Moffat RJ. Describing the uncertainties in experimental results. Experimental Thermal and Fluid Science. 1988;1(1):3-17.

    The error in a measurement is usually defined as the difference between its true value and the measured value. … The term “uncertainty” is used to refer to “a possible value that an error may have.” … The term “uncertainty analysis” refers to the process of estimating how great an effect the uncertainties in the individual measurements have on the calculated result.

    THE BASIC MATHEMATICS

    This section introduces the root-sum-square (RSS) combination (my bold), the basic form used for combining uncertainty contributions in both single-sample and multiple-sample analyses. In this section, the term dX_i refers to the uncertainty in X_i in a general and nonspecific way: whatever is being dealt with at the moment (for example, fixed errors, random errors, or uncertainties).

    Describing One Variable

    Consider a variable X_i, which has a known uncertainty dX_i. The form for representing this variable and its uncertainty is

    X_i = X_i(measured) (+/-)dX_i (20:1)

    This statement should be interpreted to mean the following:
    * The best estimate of X_i is X_i (measured)
    * There is an uncertainty in X_i that may be as large as (+/-)dX_i
    * The odds are 20 to 1 against the uncertainty of X_i being larger than (+/-)dX_i.

    The value of dX_i represents 2-sigma for a single-sample analysis, where sigma is the standard deviation of the population of possible measurements from which the single sample X_i was taken.

    The uncertainty (+/-)dX_i Moffat described, exactly represents the (+/-)4W/m^2 LWCF calibration error statistic derived from the combined individual model errors in the test simulations of 27 CMIP5 climate models.

    For multiple-sample experiments, dX_i can have three meanings. It may represent t*S_N/sqrt(N) for random error components, where S_N is the standard deviation of the set of N observations used to calculate the mean value (X_i)_bar and t is the Student’s t-statistic appropriate for the number of samples N and the confidence level desired. It may represent the bias limit for fixed errors (this interpretation implicitly requires that the bias limit be estimated at 20:1 odds). Finally, dX_i may represent U_95, the overall uncertainty in X_i.

    From the “basic mathematics” section above, the over-all uncertainty U = root-sum-square = sqrt[sum over ((+/-)dX_i)^2] = the root-sum-square of errors (rmse). That is, U = sqrt[sum over ((+/-)dX_i)^2] = (+/-)rmse.

    The result R of the experiment is assumed to be calculated from a set of measurements using a data interpretation program (by hand or by computer) represented by

    R = R(X_1,X_2,X_3,…, X_N)

    The objective is to express the uncertainty in the calculated result at the same odds as were used in estimating the uncertainties in the measurements.

    The effect of the uncertainty in a single measurement on the calculated result, if only that one measurement were in error, would be

    dR_X_i = (dR/dX_i)*dX_i

    When several independent variables are used in the function R, the individual terms are combined by a root-sum-square method.

    dR = sqrt[sum over ((dR/dX_i)*dX_i)^2]

    This is the basic equation of uncertainty analysis. Each term represents the contribution made by the uncertainty in one variable, dX_i, to the overall uncertainty in the result, dR.

    http://www.sciencedirect.com/science/article/pii/089417778890043X
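
    As a purely illustrative aside, here is a minimal numerical sketch of the root-sum-square combination described in the two Moffat extracts above; the sensitivities and uncertainties in it are made up, not taken from any experiment or model.

    ```python
    # Root-sum-square (RSS) combination of uncertainty contributions,
    # dR = sqrt[sum over ((dR/dX_i)*dX_i)^2], with made-up numbers.
    import math

    # (sensitivity dR/dX_i, uncertainty dX_i) pairs for three variables.
    terms = [(2.0, 0.1), (0.5, 0.4), (1.0, 0.05)]

    dR = math.sqrt(sum((sens * dx) ** 2 for sens, dx in terms))
    print(f"dR = +/-{dR:.3f}")   # the combined uncertainty, reported as (+/-)dR
    ```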

    Vasquez VR, Whiting WB. Accounting for Both Random Errors and Systematic Errors in Uncertainty Propagation Analysis of Computer Models Involving Experimental Measurements with Monte Carlo Methods. Risk Analysis. 2006;25(6):1669-81.

    [S]ystematic errors are associated with calibration bias in the methods and equipment used to obtain the properties. Experimentalists have paid significant attention to the effect of random errors on uncertainty propagation in chemical and physical property estimation. However, even though the concept of systematic error is clear, there is a surprising paucity of methodologies to deal with the propagation analysis of systematic errors. The effect of the latter can be more significant than usually expected.

    Usually, it is assumed that the scientist has reduced the systematic error to a minimum, but there are always irreducible residual systematic errors. On the other hand, there is a psychological perception that reporting estimates of systematic errors decreases the quality and credibility of the experimental measurements, which explains why bias error estimates are hardly ever found in literature data sources.

    Of particular interest are the effects of possible calibration errors in experimental measurements. The results are analyzed through the use of cumulative probability distributions (cdf) for the output variables of the model.”

    A good general definition of systematic uncertainty is the difference between the observed mean and the true value.”

    Also, when dealing with systematic errors we found from experimental evidence that in most of the cases it is not practical to define constant bias backgrounds. As noted by Vasquez and Whiting (1998) in the analysis of thermodynamic data, the systematic errors detected are not constant and tend to be a function of the magnitude of the variables measured.”

    Additionally, random errors can cause other types of bias effects on output variables of computer models. For example, Faber et al. (1995a, 1995b) pointed out that random errors produce skewed distributions of estimated quantities in nonlinear models. Only for linear transformation of the data will the random errors cancel out.”

    Although the mean of the cdf for the random errors is a good estimate for the unknown true value of the output variable from the probabilistic standpoint, this is not the case for the cdf obtained for the systematic effects, where any value on that distribution can be the unknown true. The knowledge of the cdf width in the case of systematic errors becomes very important for decision making (even more so than for the case of random error effects) because of the difficulty in estimating which is the unknown true output value. (emphasis in original)”

    It is important to note that when dealing with nonlinear models, equations such as Equation (2) will not estimate appropriately the effect of combined errors because of the nonlinear transformations performed by the model.

    Equation (2) is the standard uncertainty propagation sqrt[sum over(±sys error statistic)^2].

    In principle, under well-designed experiments, with appropriate measurement techniques, one can expect that the mean reported for a given experimental condition corresponds truly to the physical mean of such condition, but unfortunately this is not the case under the presence of unaccounted systematic errors.

    When several sources of systematic errors are identified, beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:

    beta ~ sqrt[sum over(theta_S_i)^2], where i defines the sources of bias errors and theta_S is the bias range within the error source i. Similarly, the same approach is used to define a total random error based on individual standard deviation estimates,

    e_k = sqrt[sum over(sigma_R_i)^2]

    A similar approach for including both random and bias errors in one term is presented by Deitrich (1991) with minor variations, from a conceptual standpoint, from the one presented by ANSI/ASME (1998).

    http://dx.doi.org/10.1111/j.1539-6924.2005.00704.x

    Kline SJ. The Purposes of Uncertainty Analysis. Journal of Fluids Engineering. 1985;107(2):153-60.

    The Concept of Uncertainty

    Since no measurement is perfectly accurate, means for describing inaccuracies are needed. It is now generally agreed that the appropriate concept for expressing inaccuracies is an “uncertainty” and that the value should be provided by an “uncertainty analysis.”

    An uncertainty is not the same as an error. An error in measurement is the difference between the true value and the recorded value; an error is a fixed number and cannot be a statistical variable. An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable.

    The term “calibration experiment” is used in this paper to denote an experiment which: (i) calibrates an instrument or a thermophysical property against established standards; (ii) measures the desired output directly as a measurand so that propagation of uncertainty is unnecessary.

    The information transmitted from calibration experiments into a complete engineering experiment on engineering systems or a record experiment on engineering research needs to be in a form that can be used in appropriate propagation processes (my bold). … Uncertainty analysis is the sine qua non for record experiments and for systematic reduction of errors in experimental work.

    Uncertainty analysis is … an additional powerful cross-check and procedure for ensuring that requisite accuracy is actually obtained with minimum cost and time.

    Propagation of Uncertainties Into Results

    In calibration experiments, one measures the desired result directly. No problem of propagation of uncertainty then arises; we have the desired results in hand once we complete measurements. In nearly all other experiments, it is necessary to compute the uncertainty in the results from the estimates of uncertainty in the measurands. This computation process is called “propagation of uncertainty.”

    Let R be a result computed from n measurands x_1, …, x_n, and let W denote an uncertainty, with the subscript indicating the variable. Then, in dimensional form, we obtain W_R = sqrt[sum over (error_i)^2].

    https://doi.org/10.1115/1.3242449

    Henrion M, Fischhoff B. Assessing uncertainty in physical constants. American Journal of Physics. 1986;54(9):791-8.

    “Error” is the actual difference between a measurement and the value of the quantity it is intended to measure, and is generally unknown at the time of measurement. “Uncertainty” is a scientist’s assessment of the probable magnitude of that error.

    https://aapt.scitation.org/doi/abs/10.1119/1.14447

  69. I think Stokes is right. The toy model loses it. CMIP5 does not. And if it does, it’s fixed. But you aren’t talking of the math of a CMIP5. You are talking about your toy model. And I think you are pulling out 1 variable of maybe 20 that they use and attacking that one in your toy model.

    A CMIP5 has maybe 20 things. A toy model has 3 or whatever it is. When we take 1 of those 20 things and put into the toy model, that model doesn’t work. I don’t care.

    The CMIP5 has a deal. It’s something to do with the TOA. It keeps the CMIP5 bounded. They all have it and it works. And when it doesn’t work, it’s fixed. The toy model doesn’t have this same deal.

    That the CMIP5 is bounded means its problem has been fixed. I can say a CMIP5 has problems. But then I need to demonstrate it. With results. And the CMIP5s did have problems, but the one at issue has been fixed. And now it gives results.

    Yet the problem identified should at least show up in some kind of distribution in all of the CMIP5 results. Because whatever math problem you found, should always be there, in all of the results, though maybe in a bell curve distribution. But it could just cancel out like has been suggested.

    But if a simple math case is solid as suggested, then it lives in everything at issue. If it’s true, it’s there. Now where is it?

      • I am saying if this error exists, we should see it in the CMIP5 outputs. And as a warm bias only half the time as I understand it. And a cold bias the other half.

        • You’d be wrong, Ragnaar. Like so many others, you think uncertainty is error. It is not.

          Uncertainty does not specify a range of model simulated air temperatures.

          Uncertainty in projected temperature, i.e., (+/-)C, does not imply that the projection should sometimes have positive temperature error and sometimes negative.

          Your entire approach to the problem of uncertainty is incorrect.

          Please look at the extract from Kline, 1985 in the set of extracts I posted above, here.

          Second paragraph, first line is, “An uncertainty is not the same as an error.”

          As I mentioned to Rich above, CMIP5 models (like all others) are tuned to produce the historical temperature trends. Their inter-model consistency is put in by hand.

          See J. T. Kiehl (2007), Twentieth century climate model response and climate sensitivity, GRL 34(22), L22710.

        • Ragnaar: I am saying if this error exists, we should see it in the CMIP5 outputs. And as a warm bias only half the time as I understand it. And a cold bias the other half.

          And I have written that you are wrong on both counts.

          There is an ambiguity, perhaps, about which “time interval” you are halving. There is no reason to think that the error is + for half of the GCM runs and – for the other half.

    • Ragnaar,

      “The toy model loses it. CMIP5 does not. And if it does, it’s fixed. But you aren’t talking of the math of a CMIP5. You are talking about your toy model. And I think you are pulling out 1 variable of maybe 20 that they use and attacking that one in your toy model.”

      If that one variable provides the same output from the “toy” model as does the CMIP5 for temperature prediction, then which is the better model for temperature prediction? Frank wasn’t attacking the variables used per se but the uncertainty in the output of the models. It doesn’t matter if the models use 20 variables or one variable when both produce the same linear output for temperature!

      You sound a little bit jealous over the climate models to me. Their complexity is not a virtue if they are nothing more than generators of a linear relationship.

    • Uncertainty is not bounded, Ragnaar.

      Propagated calculational uncertainty can grow well beyond the limits of a bounded system. When it does so, it means the model expectation values have no information; no physical meaning.

      A lower limit of resolution defines the limit of model reliability. One needs only to evaluate that limit, to know whether to trust a model. Which is what I have done.

      Throughout your comment, you’ve confused error with uncertainty. A fatal mistake. Nothing of what you wrote has any relevance.

      • Because a thing is a problem in one context, your model, doesn’t mean it’s a problem in another context, a CMIP5. A CMIP5 is a system. With all things interacting and dependent on each other. With your thing that is the problem, there are other things keeping it in check in a CMIP5. I am making a few assumptions here. When you don’t keep it in check, it is a problem as you are saying.

        Here’s what I think would help you. Put it into a picture story. Your story exists beyond almost everyone’s understanding. I am trying to tell a story about it. You could even put my attempt into a picture story, to show me why I am wrong.

        Here’s another attempt. My pick-up works. Someone is telling me it shouldn’t and should waver all over the road going half into the other lane every 30 seconds. But it works fine. My pick-up has errors as it’s 20 years old. The steering ain’t what it used to be. My best proof is not responding to whoever is saying my pick-up doesn’t work and just driving around as I normally do. So you need some crashes.

        • Your approach to the problem remains wrong, Ragnaar. It’s not about error, it’s about uncertainty.

          It’s not a question of your pick-up wandering around on the road.

          It’s a question of whether it will get you to the town 80 km away, given that the mechanic says your transmission is failing and it is presently making awful grinding noises.

          Uncertainty, Ragnaar, not error.

          • I am uncertain to what inch my left front tire is off the center line all the time. I am uncertain every second. But I am fine. The other guy who posted on this showed a story of the temperature over time. A story in pictures.

            Life exists with uncertainty. Now model it.

        • “With your thing that is the problem, there are other things keeping it in check in a CMIP5. ”

          The issue isn’t “keeping it in check”. The issue is how certain the output is. Boundary limits don’t guarantee accuracy; the uncertainty still remains.

          • So we have uncertainty that is bounded. Now this uncertainty must be assigned a value that is useful. Say it’s growing exponentially. But 90% of that growth is being thrown away by what bounds it. So now we need the effective uncertainty. If we are throwing away uncertainty, where does that leave us? Above I say my left front tire is I don’t know how many inches from the center line. But as long as things look right, I don’t care. I am throwing away uncertainty all the time and that doesn’t bother me. That there is this uncertainty, doesn’t seem to impact the system. You might say CMIP5s can’t do what I am doing. They track. That’s what prevents them from blowing up, same with the climate. I will say it again. CMIP5s track. The spin up is an example of that. Everything that is wrong, go away. That’s tracking. So as has been said, they track CO2 too. You set the CO2, and they track it relentlessly for I suppose at least 50 years. But this propagation of uncertainty seems to have been solved so far by tracking a CO2 equilibrium. That is the CO2 sets the equilibrium. So they’ve been teaching the models to track the CO2 equilibrium and inventing a bunch of stuff, some of it questionable. I think it was said, the cloud problem has a made up assumption in it. But the whole deal still tracks. We can say there’s a total model uncertainty. That includes all the uncertainties if each one could be distilled.

          • “So we have uncertainty that is bounded. Now this uncertainty must be assigned a value that is useful. Say it’s growing exponentially. But 90% of that growth is being thrown away by what bounds it.”

            Who says the uncertainty is bounded? In an iterative process the uncertainty adds at each step. The only boundary on how much it grows is how many iterative steps are taken. It’s why a weather forecast for 24 hours ahead is more certain than one for 48 hours. The uncertainty grows over that second 24 hours. Nor does uncertainty usually grow exponentially. Did you not read the past posts on a ruler that is “about” 12 inches long, say somewhere between 11″ and 13″? In measuring a room using that ruler, the first iterative step has an uncertainty of +/- 1″. The second placement doubles that: the two lengths together could span anywhere from 22″ to 26″, an uncertainty of +/- 2″ instead of +/- 1″. The third iteration will be +/- 3″.
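
            Purely as illustration, here is a toy sketch of that ruler example, assuming the +/- 1″ per placement described above; for comparison it also prints the root-sum-square accumulation used in the propagation literature quoted earlier in the thread, which grows as sqrt(n) rather than n.

            ```python
            # Toy sketch of the ruler example: an assumed +/-1 inch uncertainty
            # per placement, accumulated over n end-to-end placements.
            import math

            u_step = 1.0   # assumed uncertainty per placement, in inches

            for n in (1, 2, 3, 10):
                worst_case = n * u_step        # linear (worst-case) accumulation
                rss = math.sqrt(n) * u_step    # root-sum-square accumulation
                print(f"n={n:2d}: worst case +/-{worst_case:.1f} in, RSS +/-{rss:.2f} in")
            ```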

            “Above I say my left front tire is I don’t know how many inches from the center line. But as long as things look right, I don’t care.”

            How do you know things look “right”? If you don’t know where your tire is in relation to the center line then you don’t know where it is from the drop-off of the pavement on the ditch side. And if it is like every truck I have driven you can’t see the tire on the ditch side so you *should* care what the uncertainty is.

            “That there is this uncertainty, doesn’t seem to impact the system. ”

            Again, how do you know? The climate modelers never tell us what the uncertainty of the model output is. So when they claim they can *know* the average temperature growth to the nearest hundredth of a degree, how are we to judge if that is inside or outside the uncertainty of the projection? Since most temperature measuring devices, especially in the past, can only resolve to the nearest +/- 0.5C, how can the climate models have a resolution better than this? Especially since the central limit theorem doesn’t apply when you are using independent measurement devices measuring independent conditions. I hate to keep harping on the 1000 steel girder example, but if you measure 1000 steel girders with 1000 different tape measures, how does the central limit theorem help you? The differences in the measurements are not random but, instead, are systematic. You don’t know if the tape measures were hot or cold, and therefore shrunken or expanded, and the same applies to the girders. The central limit theorem won’t help you get an accurate average in such a case.
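
            As an illustration of the girder point, here is a small simulation under the assumption that all the tape measures share a common systematic offset (say, thermal expansion) on top of their individual random errors; every number in it is made up. The random errors average away, but the shared offset does not, so the mean of the 1000 measurements stays biased no matter how many girders are measured.

            ```python
            # Toy simulation: 1000 girders measured with 1000 tapes that share a
            # common systematic offset plus individual random errors (made-up numbers).
            import numpy as np

            rng = np.random.default_rng(1)
            n = 1000
            true_lengths = rng.normal(600.0, 0.5, n)   # true girder lengths, cm
            shared_bias = 0.3                          # common systematic offset, cm
            random_err = rng.normal(0.0, 0.2, n)       # per-tape random error, cm

            measured = true_lengths + shared_bias + random_err

            print(f"true mean     : {true_lengths.mean():.3f} cm")
            print(f"measured mean : {measured.mean():.3f} cm")
            print(f"offset of mean: {measured.mean() - true_lengths.mean():.3f} cm")
            ```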

            “Everything that is wrong, go away. That’s tracking. ”

            Tracking what? We already know the model outputs don’t match the real world, they run hot. So *something* wrong did *not* go away.

            “But this propagation of uncertainty seems to have been solved so far by tracking a CO2 equilibrium.”

            Tracking CO2 doesn’t mean the temperature outputs of the models are accurate! They will still have a measure of uncertainty!

            “But the whole deal still tracks. We can say there’s a total model uncertainty. That includes all the uncertainties if each one could be distilled.”

            Again, how do you know the models track anything? If the uncertainty interval is larger than the resolution of the temperature changes they claim then you don’t know if the models are tracking or not.

          • Ragnaar, you merely find a new way to be wrong.

            Some days ago, I added a long post from the literature about the meaning of uncertainty, here. It establishes that uncertainty is not error; establishes, Ragnaar.

            You have ignored it. Fine. Be wrong as you prefer.

            Uncertainty is unbounded. It is not constrained by physical boundary conditions.

            Your arguments are wrong. Several people here have tried to show you the way, most especially Tim Gorman and Matthew Marler. You prefer to ignore their correct explanations.

            Good luck Ragnaar with the rest of your life, because you’ll not get anywhere in science.

  70. https://pubpeer.com/publications/391B1C150212A84C6051D7A2A7F119#5

    #5 Carl Wunsch
    I am listed as a reviewer, but that should not be interpreted as an endorsement of the paper. In the version that I finally agreed to, there were some interesting and useful descriptions of the behavior of climate models run in predictive mode. That is not a justification for concluding the climate signals cannot be detected! In particular, I do not recall the sentence “The unavoidable conclusion is that a temperature signal from anthropogenic CO2 emissions (if any) cannot have been, nor presently can be, evidenced in climate observables.” which I regard as a complete non sequitur and with which I disagree totally.

    The published version had numerous additions that did not appear in the last version I saw.

    I thought the version I did see raised important questions, rarely discussed, of the presence of both systematic and random walk errors in models run in predictive mode and that some discussion of these issues might be worthwhile.

    CW

    • That sentence, “The unavoidable conclusion is that a temperature signal from anthropogenic CO2 emissions (if any) cannot have been, nor presently can be, evidenced in climate observables.”

      … was in every single version of the paper CW saw.

      … which I regard as a complete non sequitur and with which I disagree totally.”

      I would like to see CW, or anyone else, show how a model can simulate the effect of a perturbation that is two orders of magnitude below the model’s lower limit of resolution.

      Given the immediately very large uncertainty bounds around projected air temperatures, it is certainly not a non-sequitur to say that a temperature signal from anthropogenic CO2 emissions (if any) cannot have been, nor presently can be, evidenced in climate observables.

      I read CW’s comment yesterday at PubPeer, and admit to a bit of shock that he would write such a disavowal of his own review.

      He may disagree, but it is nevertheless clear that the huge uncertainty following from low model resolution means inability to detect any temperature effect of CO2 emissions.

      Let’s notice, too, Anthony Banton that you neglected to mention any of my readily available replies there. Prejudiced a bit, is it? Or just careless.

        • Relevant to the way Carl Wunsch thinks, Anthony Banton.

          And relevant to those who put politics ahead of science; to those who employ fake criticism.

          Not relevant to anything in the paper.

  71. Pat Sep22 3:11pm:

    Note 2: “Screed”

    [I shall consider my Sep22 2:27pm to be Note 1: Uncertainty = random variable, error = random variate.]

    I was in danger of being a Mrs. Malaprop there (my wife sometimes complains of this), but I was helpful in prepending the adjective “helpful”. Therefore Pat realized, I think, that my “helpful screed” was not pejorative.

    The New Shorter Oxford English Dictionary Volume 2, a mere 3700-odd pages long, gives “screed” as “a long, esp. tedious, piece of writing or speech; a (dull) tract”, but includes the apparently non-pejorative example “Any news will be welcome and I will give you a screed in reply”. I wonder whether the negative connotations have become more prevalent in modern usage (there is no date for the given example).

  72. Note 3: Error and estimated error

    In Note 1 “Uncertainty = random variable, error = random variate” I wrote about a random variable X having a mean m and variance s^2, and how if m is known then an observation x of X records an error x-m. But often m is not known. In this case replication is used; n observations x_i are taken and m* = sum_1^n x_i/n, the sample mean, is used to determine “errors” x_i-m*. However, because m* is not m, these are really estimated errors, and sometimes the distinction will be important. Often the fact of estimation is understood and the “estimated” adjective is sensibly dropped for brevity. So for example the Root Mean Squared Error (RMSE) is often of the estimated type, and is then

    RMSE = sqrt(sum_1^n (x_i-m*)^2/(n-1))

    Sometimes n is used in place of n-1 there, but the latter makes MSE = RMSE^2 an unbiassed estimate of the true error variance under certain conditions.
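
    As a purely illustrative aside, a minimal sketch of that estimated RMSE, using made-up observations:

    ```python
    # Estimated RMSE as defined in Note 3, with hypothetical observations x_i.
    import math

    x = [9.7, 9.9, 10.1, 9.8, 10.0]    # made-up repeated observations
    n = len(x)
    m_star = sum(x) / n                # sample mean m*

    # n-1 divisor, so MSE = RMSE^2 is an unbiased estimate of the error
    # variance under the usual conditions.
    rmse = math.sqrt(sum((xi - m_star) ** 2 for xi in x) / (n - 1))
    print(f"m* = {m_star:.3f}, RMSE = {rmse:.3f}")
    ```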

    • “In Note 1 “Uncertainty = random variable”

      What makes you think uncertainty is a random variable? Take my voltmeter with an uncertainty of +/- 0.1v. How is that +/- 0.1v random? It doesn’t change from reading to reading. That uncertainty will remain no matter how many individual measurements I make. You can take as many sample groups of arbitrary size as you want out of those individual measurements but you won’t cancel out the uncertainty using the central limit theorem. You might develop a mean which you *think* is more accurate than any of the individual measurements but you still won’t be sure because the uncertainty remains.

      • Tim: well, suppose you are measuring a voltage which some super duper voltmeter has established is 10.0000 volts. What does it mean for you to say that your voltmeter has an uncertainty of +/- 0.1v? Does it mean that for example it is measuring 9.70 volts and the manufacturer has sworn blind and false that the true voltage is therefore between 9.6 and 9.8 volts (or a wider range if some normal distribution is involved)? Or does it mean that it read 9.93 volts and the manufacturer swore blind and true that the true voltage is between 9.83 and 10.03 volts?

        When you have clarified what that uncertainty “interval” means, it may be possible to discuss any statistical tests that might pertain.

        • See,

          Every voltmeter I have uses resistive dividers to provide for measuring different voltage levels. Those resistive dividers have uncertainty built into them because no resistor can be considered perfect. Just as no thermometer can be considered perfect. Even the new thermistors used in modern thermometers have uncertainty. They may have high resolution but that doesn’t eliminate the uncertainty. Manufacturers try to reduce the uncertainty by generating calibration curves but the uncertainty can only be reduced, not eliminated. And once the thermometer leaves the calibration shop the uncertainty grows thereafter, it never gets better.

          Analog meters have all kinds of uncertainties, e.g. static electricity on the dial face affecting needle positioning, non-linearity in the meter movement, even humidity. The fact that no meter dial can have infinitely scaled meter markings requires interpolation to obtain a measurement, but this can be somewhat reduced by taking the average of multiple readings (the central limit theorem). The other uncertainties can’t be accounted for since the impacts of static electricity, non-linear meter movements, etc. are unknown.

          Digital meters have what is called “last digit” uncertainty. What does it take for that last digit to change from one digit to the next? A major unknown affecting the uncertainty of any reading.

          For your example the meter would not read 9.70v. It would read 9.7v. And yes, an uncertainty of +/-0.1 v would mean the reading could be between 9.6v and 9.8v inclusive. Stop and think about your significant digits. I know that concept isn’t taught much anymore, just as uncertainty isn’t. Why would a meter with a resolution of +/- 0.1v have a readout in the hundredths? The uncertainty would swamp anything shown on the meter in the last digit – which is the entire point of Dr. Frank’s paper on uncertainty in the climate models. How can they resolve to a hundredth of a degree when their uncertainty interval is larger than that? Precision is not accuracy. Uncertainty is not error.

          • Calibration uncertainty is an interval within which an instrument or a model can provide no data. It is not a random error.

            Calibration uncertainty is typically empirically determined. It is an empirical measure of the resolution of the method. Serial calculations employing values with appended uncertainties put a root-sum-square of those uncertainties in the final result.

            All that is amply explained in the post above, presenting extracts from the literature.

            I have no idea why it is invisible to so many.

          • Dr. Frank,

            “I have no idea why it is invisible to so many.”

            Because so many people have no actual experience with using analog methods in the field to accomplish a goal. None of them have ever had to actually worry about tolerances on a fish plate used to connect bridge girders together. Or using an analog voltmeter to determine circuit conditions in something affecting human lives, they just hook up a digital voltmeter and By God it gives you a precise and accurate value, no questions need to be asked, uncertainty is zero. Same for their digital thermometer.

            And it is endemic throughout much of science today. My PhD son doing HIV research has pointed out how so many experiments today in the biological sciences simply can’t be replicated. It’s because so many of the researchers use methods with high degrees of uncertainty but they simply don’t understand that. So they trust their results to be both precise and accurate and never a question asked! My son ignored the advice of his advisors in undergraduate studies that he didn’t need any statistics classes and took 9 hours anyway. When he combines that with his knowledge of immunology he understands how to get repeatable results with an uncertainty interval. Just because two different researchers get different results doesn’t mean either is wrong, not if they are within the range of uncertainty.

          • Tim, thanks for the clarification. I threw in the centivolt digit just to see what you would say. Now, in order for scientists to do meaningful mathematics on uncertainty, they have to formalize it into a random variable. Otherwise, uncertainties cannot be combined together in a proper manner, in the way elucidated in Pat’s paper.

            In our example, we know that the voltmeter has a bias, since it is reading low. We don’t know exactly what that bias is, but a best estimate from the single experiment is -0.3v. Then on top of that there is the distribution of its error, for which we have that single reading of -0.3v. It might be that its error is uniformly distributed in the interval (-0.4,-0.2), and such an assumption might lead to reasonable conclusions. However, given the physical nature you describe, it is more likely to be a normal distribution, say N(-0.3,0.1^2). But it is still more likely that it is something like N(-0.27,0.08^2). Multiple experiments, including changing the true (or more accurately known) voltage, would help to determine that. Whether the more accurate information on its error distribution would be useful would depend on the application, and one would suspect that most times it would not.
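
            Purely as illustration, here is a toy simulation of the calibration scenario described above: a source taken to be exactly 10 V, read repeatedly by a meter with an assumed bias of -0.27 V and normally distributed scatter of 0.08 V. All of the numbers are illustrative only.

            ```python
            # Toy calibration of a hypothetical voltmeter against a known source.
            import numpy as np

            rng = np.random.default_rng(2)
            true_v = 10.0        # source value taken as exactly known
            bias = -0.27         # assumed systematic offset of this meter (V)
            sigma = 0.08         # assumed random scatter per reading (V)

            readings = true_v + bias + rng.normal(0.0, sigma, size=50)

            est_bias = readings.mean() - true_v    # estimate of the bias
            scatter = readings.std(ddof=1)         # estimate of the random part
            print(f"estimated bias    : {est_bias:+.3f} V")
            print(f"estimated scatter : {scatter:.3f} V")
            ```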

            Rich.

          • Rich,

            “Now, in order for scientists to do meaningful mathematics on uncertainty, they have to formalize it into a random variable. Otherwise, uncertainties cannot be combined together in a proper manner, in the way elucidated in Pat’s paper.”

            Why is this? When using a ruler that is assumed to be 12″ long with an uncertainty of +/- 1″ the uncertainty is not random. It is a specific interval that doesn’t change. And yet the uncertainty associated with its use over iterative measurements can be calculated quite easily – and it is not random.

            I still think you are confusing error and uncertainty. You want to try and wish uncertainty away by making an unwarranted assumption that uncertainty is a random variable when it isn’t.

            “In our example, we know that the voltmeter has a bias, since it is reading low. We don’t know exactly what that bias is, but a best estimate from the single experiment is -0.3v.”

            You just jumped from uncertainty to error.

            “Then on top of that there is the distribution of its error, for which we have that single reading of -0.3v.”

            Like I said, you just jumped from uncertainty to error. An uncertainty interval doesn’t tell you what any error might be or its distribution. It just gives you an interval within which the error can exist, it doesn’t tell you the actual error.

            “Multiple experiments, including changing the true (or more accurately known) voltage, would help to detrmine that.”

            It will help determine the error distribution. It won’t change the uncertainty interval.

            “Whether the more accurate information on its error distribution would be useful would depend on the application, and one would suspect that most times it would not.”

            Error is not uncertainty. Uncertainty is not error. Stop trying to conflate the two and you’ll understand.

  73. Note 4: The +/-4 W/m^2 is an estimated RMSE for TCF

    [Mainly for my own clarification.]

    Figure 6 of Pat’s paper says “identical SRES scenarios showing the ±1σ uncertainty bars due to the annual average ±4 Wm–2 CMIP5 TCF long-wave tropospheric thermal flux calibration error…”, where TCF is Total Cloud Forcing.

    This estimated RMSE arises from statistical analysis of the CMIP5 models. It appears from Lauer & Hamilton (2013) that this figure comes from 24 models times 20 annual observations.
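
    As a sketch of the arithmetic only (the per-model, per-year flux errors below are simulated placeholders, not the Lauer & Hamilton data), a root-mean-square over a 24 x 20 grid looks like this:

    import math
    import random

    # Illustration only: RMS of a hypothetical 24-model x 20-year grid of
    # long-wave cloud flux calibration errors (W/m^2). Values are simulated.
    random.seed(0)
    n_models, n_years = 24, 20
    errors = [[random.gauss(0.0, 4.0) for _ in range(n_years)] for _ in range(n_models)]

    rmse = math.sqrt(sum(e * e for row in errors for e in row) / (n_models * n_years))
    print("RMSE over %d models x %d years: %.2f W/m^2" % (n_models, n_years, rmse))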

  74. Note 5: An error in Equation (5)

    In Pat’s paper, it appears that Equation (5) is incorrectly derived from Equation (1). In showing this I am first going to simplify the equations in three ways. First, remove the Delta from Delta-T, because it is comparing time t to time 0 and is therefore confusing in comparison with Delta-F which compares time i to time i-1. Second, remove the K’s, as we know what units we are working in. Third, combine f_CO_2 = 0.42 and the 33 into their product 13.86. So now we have

    T_t = 13.86(1+sum_{i=1}^t Delta F_i/F_0) + a (1)

    Now, the only way to get to a single Delta F_i from this is to subtract a (t-1)-fold sum from a t-fold sum, thus:

    T_t – T_{t-1} = 13.86 Delta F_t/F_0 (5.R)

    Imagine writing i instead of t there, and then compare with Equation (5.2) as per my simplification:

    T_i +/- u_i = 13.86 (1+Delta F_i/F_0) +/- 1.665 (5.2)

    where 1.665 = 0.42*33*4/33.30.

    I am not sure whether or how this affects the validity of the propagation analysis, which effectively uses 1.665*sqrt(t) as its output, but this mismatch between (5.2) and (5.R) is disturbing.

    • See-owe to Rich: this mismatch between (5.2) and (5.R)

      It looks to me like you have found an actual error. I can’t see where it affects the uncertainty propagation analysis. Good job. Several dozen minds must have missed that error by now.

      Incidentally, I have reread not just the paper (I wonder how many remaining errors I have missed!) but also the supplementary information, which quotes reviewers’ criticisms and responds to them. It is well worth rereading.

      • Matthew, there is an actual error. It is Rich’s error.

        His mistake is always the same one. He invariably imposes his mistaken understanding on what I wrote.

        It’s very tedious.

        There is no mistake in eqn. 5, nor in eqn. 1. Indeed eqn. 5 follows directly and obviously from eqn. 1.

      • Further, Rich is not calculating a (delta)T.

        His subtraction is (delta)T_i – (delta)T_(i-1) = (delta-delta)T_(i-(i-1)). “i” can be “t;” it doesn’t matter. The equations are always (delta)T.

        The final difference, written correctly, is (delta-delta)T_(i-(i-1)) = 13.86*[(delta-delta)F_(i-(i-1))]/F_0.

        Eqn 5.2 (R) is incorrect.

        Is that disturbing?

        • Pat Frank: Is that disturbing?

          It looks like in eqn 5, deltaT_i(K) is supposed to be the uncertainty in the temperature increase from time i-1 to time i, which is proportional to the change in forcing from time i-1 to time i. That is why the uncertainty has only the single term [0.42 x etc]. If that is the case, then you have forgotten to subtract F0 from the sum of the forcings in eqn 1.

          About eqn 1 you write “In Equation 1, delta Tt is the total change of air temperature in Kelvins across projection time t,” and F0 + sum over i of deltaFi is appropriate.

          In eqn 5 the sum over i of deltaFi has been replaced by deltaFi. Furthermore, about eqn 5 you write below figure 8 that “The impact of a 0.035 Wm^2 annual forcing change on cloud cover due to increased CO2 cannot be resolved, or simulated by, climate models that have a +/- 4 Wm^2 resolution lower limit.” That seems to imply that eqn 5 describes only the annual temperature change caused by an annual forcing change. That interpretation is consistent with the uncertainty in eqn 5, which is the uncertainty of a 1 year temp change, not the uncertainty accumulated over K years.

          You also wrote before eqn 5 that: The CMIP5 average annual LWCF +/- 4.0 Wm^2 [per year]
          calibration thermal flux error is now combined with the thermal flux due to GHG emissions in emulation equation 1, to produce equation 5. This will provide an estimate of the uncertainty in any tropospheric global air temperature projection made using a CMIP5 GCM. In equation 5 the step-wise GHG forcing term, deltaFi, is conditioned by the uncertainty in thermal flux in every step due to the continual imposition of LWCF thermal flux calibration error.

          that makes it seem that deltaTi(K) is the change on the last step to the final K.

          So, the simplest explanation is that you changed the meaning of delta Ti(K) and forgot to drop F0 from the difference in forcing.

          It took a good mind to discern that the symbol delta Ti(K) is perfectly clear in each case, but has changed its meaning (if that is what occurred) from eqn 1 to eqn 5. If that is the notational detail that Nick Stokes complained of, he certainly wrote it most obscurely.

          Also notice that the uncertainty in eqn 1 can now be written as:

          a^2 = sum on i ui^2.

          Notice also that, conditional on the error in the cloud feedback parameter, eqn 1 does not describe a random walk.

          You wrote this as well: Equation 5 gives the change in temperature over a single annual step. Hence the index “i,” which you have removed. Notice: no Greek capital. No summation.

          That was my interpretation at the start. If that is true, then you have forgotten to subtract off F0 in eqn 5, which I did not notice! Not a big deal. But the change in meaning of the common symbol deltaTi(K) should be explicitly noted, and (if I am correct) the eqn corrected.

          • Matthew, “It looks like in eqn 5, deltaT_i(K) is supposed to be the uncertainty in the temperature increase from time i-1 to time i, which is proportional to the change in forcing from time i-1 to time i.”

            No, ΔT_i(K) is the change in projected temperature due to the ΔF_i change in forcing. All the uncertainty is calculated from ±4 W/m^2.

            You wrote, “That seems to imply that eqn 5 describes only the annual temperature change caused by an annual forcing change.”

            And so it does.

            That interpretation is consistent with the uncertainty in eqn 5, which is the uncertainty of a 1 year temp change, not the uncertainty accumulated over K years.

            Yes. The uncertainty over n years (I reserve K for Kelvins), is the rss of the individual years. Eqn. 5 shows how individual years are calculated.

            that makes it seem that deltaTi(K) is the change on the last step to the final K.

            No, it doesn’t. It makes it seem like each ΔT_i is calculated from ΔF_i.

            Eqn. 1 is the expression for ΔT_t. Eqn. 5 is the expression for ΔT_i. They are not the same.

            Eqn. 1 is about total change. Eqn. 5 is about the individual step.

            Your problem, Matthew (if you are indeed Matthew, which I seriously doubt) is that you’re not reading carefully.

            So, the simplest explanation is that you changed the meaning of delta Ti(K) and forgot to drop F0 from the difference in forcing.

            No. There is no ΔT_i in eqn. 1. The simplest explanation is that you’re analytically careless, whoever you are.

            Also notice that the uncertainty in eqn 1 can now be written as: a^2 = sum on i ui^2.

            There is no uncertainty in eqn. 1.

            Notice also that, conditional on the error in the cloud feedback parameter, eqn 1 does not describe a random walk.

            Eqn. 1 is not meant to describe a random walk. Eqn. 1 describes the linear extrapolation of GHG forcing.

            Eqn. 5 doesn’t describe a random walk, either. Eqn. 5 is about uncertainty, not about error.

            you have forgotten to subtract off F0 in eqn 5, which I did not notice!

            I forgot no such thing. F_0 should not be subtracted from eqn. 5. Eqn. 5 is just a single step version of eqn. 1, with the LWCF calibration uncertainty added.

            Who are you, really. I seriously doubt you’re Matthew Marler. Matthew has given every indication of careful and knowledgeable thinking. You have given neither.

          • Pat Frank: Who are you, really. I seriously doubt you’re Matthew Marler. Matthew has given every indication of careful and knowledgeable thinking. You have given neither.

            That was uncalled for.

            eqn. 1: ΔT_t = 13.86(F_0+ΔF_t)/F_0
            eqn. 5: ΔT_i = 13.86(F_0+ΔF_i)/F_0

            ΔT_t – ΔT_i = ΔΔT = [13.86(F_0+ΔF_t)/F_0] – [13.86(F_0+ΔF_i)/F_0]

            = 13.86[(F_0+ΔF_t)/F_0 – (F_0+ΔF_i)/F_0]

            =13.86[((F_0+ΔF_t)-(F_0+ΔF_i))/F_0]

            =13.86[(F_0+ΔF_t-F_0-ΔF_i)/F_0]

            =13.86[(ΔF_t-ΔF_i)/F_0]

            ΔΔT = 13.86[ΔΔF/F_0]

            You’re not calculating a ΔT, Rich. You’re calculating a ΔΔT from a ΔΔF.

            That equation is not a simplification of eqn. 5.

            You’ve dropped the Δ on T and ignored that the Δ operators on the F’s combine in a subtraction.

            If that was what you intended to write, then you need to rewrite eqn 5. If you are really calculating ” ΔΔT from a ΔΔF”, then you need to drop F0 from the computation of ΔΔF.

            Equation 1 gives the change in temperature after a total of i steps in forcing change. If that is what you meant, it is definitely non-standard notation. Index “i” seldom denotes the number of steps over which a summation is taken, which is usually represented by a capital letter, viz {i: i = 1, …, N} at the foot of the summation sign; or “i=1” at the foot of the summation and “N” at the head of the summation sign.

            Just as you have used delta Ti to represent two different things, so you have used i to represent two different things.

          • Me, Who are you, really. I seriously doubt you’re Matthew Marler. Matthew has given every indication of careful and knowledgeable thinking. You have given neither.

            Matthew, “That was uncalled for.

            Sorry, Matthew. But that post seemed very sloppy and unedited to me. It didn’t look or read like anything else of yours I’d ever read. Apologies for any offense.

            You wrote, “If you are really calculating ” ΔΔT from a ΔΔF”, then you need to drop F0 from the computation of ΔΔF.

            I’m not calculating a ΔΔT from anything. I’m not using a ΔΔF anywhere. I’m calculating exactly what eqns. 1 and 5 show.

            If that is what you meant, it is definitely non-standard notation. Index “i” seldom denotes the number of steps over which a summation …

            It’s standard where I come from. When N is unspecified, it just means one sums of whatever number of steps one likes.

            Just as you have used delta Ti to represent two different things, so you have used i to represent two different things.

            No, I have not. Eqn. 1 has ΔT_t, while eqn. 5 has ΔT_i.

            Page 2: “In Equation 1, ΔT_t is the total change of air temperature in Kelvins across projection time t,…

            On page 10, eqn. 5 ΔT_i is conditioned upon the step-wise forcing term ΔF_i. The “i” index always represents single steps.

            It all seems very obvious. The source of your misapprehension, not.

    • Equation 1 gives the change in temperature after a total of i steps in forcing change.

      It is the total change in temperature after those steps; hence the Greek capital sigma on the (delta)F_i indicating a summation. The index on T is “t,” for total time.

      Equation 5 gives the change in temperature over a single annual step. Hence the index “i,” which you have removed. Notice: no Greek capital. No summation.

      The (delta)T does not compare to time t = 0. It is the change in temperature over the step after time t = i-1.

      The delta in (delta)T_i in eqn. 5 indicates the change in temperature due to the delta forcing in step “i.” You remove it and the meaning changes.

      Eqn. 5, in other words, is a single step within the totality of steps represented by eqn. 1.

      The (delta)F_i come from the IPCC standard forcings. This is made abundantly clear in the paper. One does not have to derive the (delta)F_i from anything.

      Rich, you have recognized that (delta)F_i is the forcing change after step i-1.

      How, then, is it possible for you to have missed the meaning that (delta)T_i is the temperature change after step i-1? How did you miss that? Incredible.

      There is no mistake. You’ve just imposed your mistaken understanding of what I wrote.

      Your comments have been consistent that way, Rich.

      • Pat Frank: Equation 1 gives the change in temperature after a total of i steps in forcing change.

        Ah. Now I see that your intention in writing delta Ti(K) was merely to identify the units as Kelvin. I mistakenly took it to be the number of steps, ie, the ending value of the index of summation i. You have described i as both the index of summation and its upper limit.

        Readers who did not make my inferential mistake of using K in two senses in the same equation will naturally be confused over your summation index.

        Again, I doubt that is what Nick Stokes was aiming at. Most commonly N would be used as the upper limit of summation, but you could use I.

        In reading the supplemental information, no one caught this error (as it seems to me) before See-oh-two. The critiques largely missed the main points, or were extraneous (unless someone reveals a previously hidden meaning some time.)

        This is still a good and important paper.

        • Rich, “I wrote:

          T_t – T_{t-1} = 13.86 Delta F_t/F_0 (5.R)

          Yes, and it’s wrong.

          eqn. 1: ΔT_t = 13.86(F_0+ΔF_t)/F_0
          eqn. 5: ΔT_i = 13.86(F_0+ΔF_i)/F_0

          ΔT_t – ΔT_i = ΔΔT = [13.86(F_0+ΔF_t)/F_0] – [13.86(F_0+ΔF_i)/F_0]

          = 13.86[(F_0+ΔF_t)/F_0 – (F_0+ΔF_i)/F_0]

          =13.86[((F_0+ΔF_t)-(F_0+ΔF_i))/F_0]

          =13.86[(F_0+ΔF_t-F_0-ΔF_i)/F_0]

          =13.86[(ΔF_t-ΔF_i)/F_0]

          ΔΔT = 13.86[ΔΔF/F_0]

          You’re not calculating a ΔT, Rich. You’re calculating a ΔΔT from a ΔΔF.

          That equation is not a simplification of eqn. 5.

          You’ve dropped the Δ on T and ignored that the Δ operators on the F’s combine in a subtraction.

          Your entire approach is wrong.

          • Oh, WordPress has done it again and we are now commenting here on a comment of mine below. Anyway Pat, I agree that this Delta stuff gets confusing. In fact, your argument below looked so compelling that I thought I was going to have to apologize to you, but see below.

            I still think it would have been much clearer if instead of writing ΔT_t for T_t-T_0 you had simply used an anomaly from the initial temperature and just called it T_t, and that is why I dropped that Δ in my analysis. However, to get us back onto the same base I am going to put those Δ’s back in. But if you are going to use Δ’s with different spans, it is necessary to subscript them in order properly to follow the algebra. Thus Δ_i X_j = X_j – X_{j-i}.

            Let us now examine your claim that

            “T_t – T_{t-1} = 13.86 Delta F_t/F_0 (5.R)”
            Yes, and it’s wrong.”

            In the new Δ terminology that equation of mine now reads:

            Δ_1 T(t) = 13.86 Δ_1 F_t/F_0

            Let us follow the consequences of Equation (1), which in simplified form is

            Δ_t T_t = 13.86(F_0+sum_1^t Δ_1 F_i)/F_0
            = 13.86(F_0+ Δ_t F_t)/F_0

            So that verifies your statement:

            eqn. 1: ΔT_t = 13.86(F_0+ΔF_t)/F_0

            provided that Δ is Δ_t here. Next you write

            eqn. 5: ΔT_i = 13.86(F_0+ΔF_i)/F_0

            Just as Δ had to be Δ_t in Equation (1), now it isomorphically has to be Δ_i. But now there is a problem, because you wrote earlier that “Equation 5 gives the change in temperature over a single annual step”, which would imply that the LHS is Δ_1 T_i = T_i – T_{i-1}. So the equation above is not equivalent to Equation (5) in the paper because it has Δ_i instead of Δ_1 on the LHS. Now to get back to my equation we can let i = t-1 and get:

            Δ_1 T_t = T_t – T_i = Δ_t T_t – Δ_i T_i

            and with this I can now follow your own further analysis, adding subscripts to the Δ’s:

            Δ_t T_t – Δ_i T_i = ΔΔT = [13.86(F_0+Δ_t F_t)/F_0] – [13.86(F_0+Δ_i F_i)/F_0]
            = 13.86[(F_0+Δ_t F_t)/F_0 – (F_0+Δ_i F_i)/F_0]
            =13.86[((F_0+Δ_t F_t)-(F_0+Δ_i F_i))/F_0]
            =13.86[(F_0+Δ_t F_t-F_0-Δ_i F_i)/F_0]
            =13.86[(Δ_t F_t – Δ_i F_i)/F_0]

            This is 13.86(F_t-F_0-F_i+F_0)/F_0 = 13.86 Δ_1F_t/F_0 since i = t-1. QED!

            I wouldn’t be so severe as to say that “your entire approach is wrong”, but only to say that this element of it is.

            Rich.

          • Rich ΔT_t does not equal T_t-T_0.

            ΔT_t = 13.86*[(ΔF_total +F_0)/F_0] > 13.86 C.

            T_0 = 13.86 C. Eqn. 1 ΔT is the step-wise changed temperature and includes a summation over ΔF_i.

            Eqn. 5 is one single ΔF_i incremental step of the summation in eqn. 1.

            You wrote, “Δ had to be Δ_t in Equation (1)

            I presume you mean ΔF had to be ΔF_t in Equation (1). If so, then only at the t_th step.

            Eqn. 1 is generalized and includes a summation, so that ΔF must be ΔF_i.

            Let me reproduce the central and telling element of your derivation, Rich.

            You wrote, “Δ_t T_t – Δ_i T_i = ΔΔT = [13.86(F_0+Δ_t F_t)/F_0] – [13.86(F_0+Δ_i F_i)/F_0]
            ΔΔT = …

            This is [ΔΔT] = 13.86(F_t-F_0-F_i+F_0)/F_0 = 13.86 Δ_1F_t/F_0 since i = t-1. QED!

            You’ve derived a ΔΔT, just as I reported. ΔΔT ≠ ΔT.

            I’ll post an illustration demonstrating that.

          • I illustrate here what you’re doing wrong, Rich.

            For the sake of discussion, let F_0 = 30 W/m^2 and let ΔF_i = constant = 1 W/m^2.

            We can all agree that at time t=0, ΔF_i = 0.

            We start with your derived expression, Rich, which I present as given:

            ΔT = 13.86 C*(ΔF_i/F_0), yielding 0 C at t = 0, ΔF_i = 0.

            At time t = 1, ΔT_1 = 13.86 C*(1W/m^2/30 W/m^2) = 0.462 C

            At t = 2, ΔT_2 = 13.86 C*(1W/m^2/30 W/m^2) = 0.462 C

            At t = n, ΔT_n = 13.86 C*(1W/m^2/30 W/m^2) = 0.462 C
            +++++++++++

            Now we take eqn 1 (or eqn. 5) as they appear in the paper.

            There, slightly rearranging, eqn 1 = 13.86 C*[(Σ_iΔF_i+F_0)/F_0]:

            At t = 0, ΔF_i = 0 and ΔT = 13.86[(0+30 Wm^-2)/30 Wm^-2] = 13.86 C.

            At t = 1, ΔF_i = 1 W/m^2 and ΔT_1 = 13.86 C*[((1+30)W/m^2)/30 W/m^2] = 14.322 C; increment = 0.462 C

            At t = 2, the summation yields ΔF_i = 2 W/m^2 and ΔT_2 = 13.86 C*[((2+30)W/m^2)/30 W/m^2] = 14.784 C; increment = 0.462 C

            At t = n, the summation yields ΔF_n = n W/m^2, and ΔT_n = 13.86 C*[((n+30) W/m^2)/30 W/m^2] = n*0.462 C + 13.86 C; increment over t_(n-1) = 0.462 C.
            ++++++++++++
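
            A minimal sketch that reproduces the two tabulations above (same illustrative numbers: F_0 = 30 W/m^2, a constant 1 W/m^2 step, and the 13.86 C scale):

            # Reproduce the illustration above: F_0 = 30 W/m^2, constant 1 W/m^2
            # annual forcing step, 13.86 C scale factor.
            F0, dF, scale = 30.0, 1.0, 13.86

            for n in range(0, 4):
                dT_paper = scale * (n * dF + F0) / F0   # eqn 1 form; 13.86 C at n = 0
                increment = scale * dF / F0             # per-step change, 0.462 C
                print("t=%d: dT(paper) = %.3f C, per-step increment = %.3f C" % (n, dT_paper, increment))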

            ΔT(paper) is the step-wise changed temperature.

            ΔT(Rich) is the stepwise increment of temperature change.

            In the paper ΔT_n – ΔT_(n-1) = ΔΔT = 0.462 C = ΔT(Rich).

            ΔΔT(paper) = 0.462 C. ΔT(Rich) = 0.462 C.

            Therefore, ΔT(Rich) = ΔΔT(paper), as derived above.

            So, now the mistake is manifest. ΔT(Rich) is not ΔT(paper).

            Your ΔT is in fact paper ΔΔT.

            This makes sense because the presentation of both eqn. 1 and eqn 5 is in terms of ΔT.

            Although you wrote your difference as T_n – T_(n-1) = ΔT_n, Rich, you should have written it as ΔT_n – ΔT_(n-1) = ΔΔT_n.

            Big mistake.

            Your entire criticism is based on a misconception.

            I feel badly to point out that yet once again you have imposed your idea of what you think I should have meant onto what I in fact did mean.

            This sort of inadvertent straw man argument has been a constant in your criticisms.

            Every one is misconstrued from the start, misrepresenting or misconstruing what I actually did.

            Is it really so hard to look at the content of the paper and figure out what it actually says and what it means? It seems to me that discerning intended meaning is the first obligation a reader owes to an author.

            Especially in science, where we intend to be monosemous.

          • Pat [Sep25 8:18pm in case WordPress puts this somewhere strange],

            First of all I’d like to stress that everything I am doing here is in good faith. I genuinely want to understand how you obtain your results, though I do confess that as I have a gut instinct that the mathematical sequence is not well proven, I am indeed looking out for errors. You have to be prepared for that in peer review, whether formal or on a blog. If I find no problems I shall be relatively happy because I have little faith in GCM projections anyway.

            Your illustration has helped me to see that I had been ignoring the offset F_0 so you are right that ΔT_t in your parlance cannot equal a T_t-T_0. In fact I think we can say that your Δ is a delta, Jim, but not as we know it. To clarify things for me and hopefully others, I am going to use the following terminology:

            Δ_i X_n = X_n-X_{n-i}
            ‘Δ’ = the one in your Equation (1)
            “Δ” = the one in your Equation (5)
            R_t = 13.86 sum_{j=1}^t Δ_1 F_j/F_0

            Note that R_t represents a change in temperature across the t steps from 1 to t. Now we can rewrite your equations, starting with (1).

            ‘Δ’T_t = R_t + 13.86 + a (1.R)

            Moving to Equation (5), note that Δ_1 R_i = R_i – R_{i-1} = 13.86 Δ_1 F_i/F_0 and is a temperature change across one time step, so

            “Δ”T_i = 13.86((F_0 + Δ_1 F_i)/F_0) = 13.86 + Δ_1 R_i (5.R)

            This shows that Equation (1) is a change of temperature across t steps (a Δ_t) plus two additives, 13.86 and a, and Equation (5) is a change of temperature across 1 step (a Δ_1) plus one additive, 13.86. Why the crazy 13.86 additives? Well, it’s your paper, so you can write those equations how you like, with Δ’s that mean different things, or whatever we wish them to mean as if we were with Alice in Wonderland. I can no longer say one is “wrong” or inconsistent with the other, but I doubt that I am the only reader who finds them pretty confusing.
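
            For what it is worth, a quick numeric cross-check (using the same illustrative numbers as the stepwise example further up, F_0 = 30 W/m^2, a 1 W/m^2 annual step, and a = 0) shows the rewritten forms (1.R) and (5.R) matching the paper-style forms:

            # Cross-check of (1.R) and (5.R) against the paper-style forms, using
            # illustrative numbers only: F_0 = 30 W/m^2, 1 W/m^2 annual step, a = 0.
            F0, dF, scale, a = 30.0, 1.0, 13.86, 0.0

            def R(t):                       # R_t = 13.86 * sum_{j=1}^{t} dF_j / F_0
                return scale * (t * dF) / F0

            for t in range(1, 4):
                eqn1_paper = scale * (F0 + t * dF) / F0 + a     # eqn (1) form
                eqn1_R = R(t) + scale + a                       # (1.R)
                eqn5_paper = scale * (F0 + dF) / F0             # eqn (5) form at step t
                eqn5_R = scale + (R(t) - R(t - 1))              # (5.R): 13.86 + Delta_1 R_t
                print(t, round(eqn1_paper, 3), round(eqn1_R, 3), round(eqn5_paper, 3), round(eqn5_R, 3))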

      • Pat,

        I wrote:

        T_t – T_{t-1} = 13.86 Delta F_t/F_0 (5.R)

        Imagine writing i instead of t there, and then compare with Equation (5.2) as per my simplification:

        T_i +/- u_i = 13.86 (1+Delta F_i/F_0) +/- 1.665 (5.2)

        Now, it may well be that you changed the meaning of Delta T_i from T_i-T_0 in Equation (1) to T_i-T_{i-1} in Equation (5), but that is really bad practice and makes papers very hard to read and verify, and a good referee would have picked up on that. But even if we allow you that, you cannot then retain the 13.86*1 in my (5.2) here, or equivalently the 0.42 x 33K x [F_0/F_0] term in the (5.2) in your paper, as I believe Matthew noted above, because to get to (5.2) from (1) that term gets cancelled out. That >is< an error, although it is not necessarily an important one, because after this point you concentrate on the u_i's and not the Delta T_i's whichever version you happen to be using on a Tuesday.

        • Rich, “you changed the meaning of Delta T_i from T_i-T_0 in Equation (1) to T_i-T_{i-1} in Equation (5),…

          T_0 in eqn. 1 is 13.86 C, Rich. Eqn. 1 (delta)T does not equal T_i – T_0.

          Eqn. 5 gives the incremental change of any one (delta)F_i.

          How does it come that you have so much trouble reading what is actually there?

          • Because you keep changing it. Just today we have here
            “T_0 in eqn. 1 is 13.86 C, Rich. Eqn. 1 (delta)T does not equal T_i – T_0.
            Eqn. 5 gives the incremental change of any one (delta)F_i.”

            But here they are the same:
            “This makes sense because the presentation of both eqn. 1 and eqn 5 is in terms of ΔT.”

          • Nick, “But here they are the same:
            “This makes sense because the presentation of both eqn. 1 and eqn 5 is in terms of ΔT.”

            Yes and eqn. 1 has a summation that eqn. 5 does not.

            And eqn. 1 has a ΔT_t, while eqn. 5 has a ΔT_i.

            They are the same ΔT; one in summation, one the single-step contribution.

            Very deep math, but I know you’ll get it if you try, Nick.

    • “In Pat’s paper, it appears that Equation (5) is incorrectly derived from Equation (1).”
      Well, yes, that is one of its problems. Basically it drops a Σ. It goes from F₀+ΣΔFᵢ to F₀+ΔFᵢ. And muddles its units in 5.2. This is really just a version of saying that you shouldn’t compound the supposed error in F.

      Of course, the real problem is that error propagation in (1) has nothing to do with error propagation in a GCM.

      • I’m glad to see you opportunistically agreeing with Rich’s mistaken derivation, Nick. It shows that your prejudice overcomes not only your sense but your math skills.

        The muddle is yours, Nick. Why would not the equation of a single step, i, drop the Σ indicating a sum over i?

        If I had kept the Σ in eqn. 5.1, 5.2, that would have been a mistake.

        But you knew that, didn’t you. But you proceeded with the falsehood anyway.

        GCMs are demonstrated to be no more than linear extrapolation machines. Linear error propagation is exactly appropriate.

        • ” Why would not the equation of a single step, i, drop the Σ indicating a sum over i?”
          Because it means something totally different. F₀+ΣΔFᵢ means you take all the ΔFᵢ and add them to F₀. It also means the i suffix does not apply to the result. There is no way in that process you deal with a single F₀+ΔFᵢ, and no way that can sensibly acquire an index i.

          If there were any logic in what you say, it would involve the partial sum of the first i terms, not just one plucked out of the sum. And then there would be a whole other story on its uncertainty.

          • “It also means the i suffix does not apply to the result. There is no way in that process you deal with a single F₀+ΔFᵢ, and no way that can sensibly acquire an index i.”

            Are you actually thinking about what you are saying?

            The process is made up of iterative steps. The term ΣΔF *is* a sum of the iterative steps that came before when considering any specific step i, i.e. the sum of steps 1 through i-1. So why wouldn’t it have an index i?

            I still think you are just trolling this thread to see what you can cause to be generated.

          • “a sum of the iterative steps that came before when considering any specific step i”
            Yes. So you might add ΔFᵢ to the sum of F₀ and previous ΔF. But why would you add the i-th ΔF to the original F₀ without anything else?

          • Nick,

            Look closely at these two equations and tell me what the difference is.

            ∆T_t(K) = f_CO2 x 33K x [(F_0 + Σ_i ∆F_i)/F_0] + a (1)

            ∆T_i(K) ± u_i = 0.42 x 33K x [(F_0 + ∆F_i ± 4 Wm⁻²)/F_0] (5.1)

          • Nick, “…no way that can sensibly acquire an index i.

            Index i indicates step number. When i = 1, you in fact, “deal with a single F₀+ΔFᵢ,” namely F₀+ΔF_1.

            Really weak, Nick.

          • “When i = 1, you in fact, “deal with a single F₀+ΔFᵢ,” namely F₀+ΔF_1.”
            And when i=5, why would you add ΔF_5 to the original F₀? But you said
            “No, ΔT_i(K) is the change in projected temperature due to the ΔF_i change in forcing.”
            So what is F₀ doing there? If the change in forcing were zero, the change in temperature would have been 0.42 * 33K = 13.9 K (in 1 year!). If there were no change in forcing at all, the temperature would have gone on increasing by those uniform steps forever.

            This isn’t just uncertainty. Eq 5 now describes a quite different, and totally implausible model, with rapid linear warming not in response to added GHGs, but to the preindustrial (1900, you say) F₀.

      • Nick Stokes: Of course, the real problem is that error propagation in (1) has nothing to do with error propagation in a GCM.

        Once again, this is about uncertainty propagation, and a in eqn 1 can be written in terms of the ui in eqn 5, as I did above.

        • MRM,
          “Once again, this is about uncertainty propagation”
          Once again, the paper is titled “propagation of error”.

          “a in eqn 1 can be written in terms of the ui in eqn 5”
          The coefficient a in Eq 1 has nothing to do with uncertainty. It is specified:
          “Finally, coefficient a = 0 when ΔTₜ is calculated from a temperature anomaly, but is otherwise the unperturbed air temperature.”

          ” I mistakenly took it to be the number of steps”
          The number of steps is specified, and is neither K nor N:
          “ΔTₜ is the total change of air temperature in Kelvins across projection time t”

          “So, the simplest explanation is that you changed the meaning of delta Ti(K) and forgot to drop F0 from the difference in forcing”
          Well, Eq 5.1 could represent a difference, but there are a lot of words that say it doesn’t:
          “Where ±uᵢ is the uncertainty in air temperature, and ±4 Wm⁻² is the uncertainty in tropospheric thermal energy flux due to CMIP5 LWCF calibration error. The remaining terms of equations 5 are defined as for equation 1. In equations 5, F₀+ΔFᵢ represents the tropospheric GHG thermal forcing at simulation step “i.” The thermal impact of F₀+ΔFᵢ is conditioned by the uncertainty in atmospheric thermal energy flux.”

          • Nick, “Once again, the paper is titled “propagation of error”.

            Propagation of error calculates the uncertainty in a result. See here.

        • Nick Stokes: Once again, the paper is titled “propagation of error”.

          Yep, that’s the title. The paper is about the propagation of uncertainty.

          The number of steps is specified, and is neither K nor N:
          “ΔTₜ is the total change of air temperature in Kelvins across projection time t”

          The number of steps is not represented in the equation.

          I don’t know where Matthew Marler got the idea that he was doing something with correlations.

          The correlations appear where appropriate in eqns 3 and 4. In Eqns 5, the sequence of annual increments is deterministic, conditional on the value of the cloud feedback parameter and its random error, so correlations do not enter the calculation. If the process were a random walk, as you seem to persist in believing it to be, with a random increment each year, then the correlations of the successive increments would need to be included in calculating the error variances of the annual increments.

          You seem to be stumbling over the distinction between error and the uncertainty of the size of the error. You also seem to be stumbling over the distinction between distributions and conditional distributions (c.f. my dice example). If the truth of the value of the cloud feedback parameter, and hence the error in the current estimate, ever became known, then the errors induced in the temp series by the error in the parameter estimate could be propagated through the calculations, as in your response to the meter stick metaphor. Instead, all we have is an uncertainty (a range of probable values) in the parm value, so it is the uncertainty that propagates, as described by Pat Frank.

          • ” correlations do not enter the calculation”
            So what is the basis for adding the uᵢ in quadrature, in Eq 6, if it isn’t an εKε calculation?

            “The number of steps is not represented in the equation”
            He says the ΣΔFᵢ is being summed over “projection timesteps”. He says ΔTₜ is over projection period t. It’s true that he doesn’t define the timestep right there, but pretty soon he is talking about annual.

          • Nick Stokes: It’s true that he doesn’t define the timestep right there, but pretty soon he is talking about annual.

            Sometimes I think that you do not bother to read. The number of time steps is not specified in the formula.

          • An equation is supposed to tell you how to do a calculation. If you specify a sum, it has to be over a specified (and enumerated) set of numbers. I cannot see how it would be other than the number of time steps here.

          • “An equation is supposed to tell you how to do a calculation. If you specify a sum, it has to be over a specified (and enumerated) set of numbers. I cannot see how it would be other than the number of time steps here.”

            Since this is an emulation and analysis of the GCM’s forecasts for the climate, don’t you suppose the number of steps and their size would be the same as the GCM’s?

  75. Note 6: What does the uncertainty interval mean?

    Specifically, what does the interval of roughly -15K to +17K at 2100 in Figure 6(B) mean? As it is a +/-1 sigma, is it a 68% confidence interval for something? If so, is it for:

    a) each CMIP5 model run’s projection to 2100
    b) the ensemble mean’s projection to 2100
    c) something else

    If not, then what is it?

    • “As it is a +/-1 sigma”
      It seems to be. It seems the arithmetic is primitive, straight from Eq 6, based on variances. The error bars are
      ±(0.42*33*4/33.3)*sqrt(n) = ±1.665*sqrt(n) K, where n is the number of years elapsed. So in Fig 6, n=100 and it is ±16.65 about each curve at the end. And in Fig 8 n=62 at the end and it is ±13.1 about the marked curves.
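
      That arithmetic is easy to reproduce; a minimal sketch using only the numbers quoted above:

      import math

      # Error-bar arithmetic quoted above: u(n) = (0.42*33*4/33.3)*sqrt(n) = 1.665*sqrt(n) K.
      per_step = 0.42 * 33 * 4 / 33.3          # 1.665 K per annual step
      for n in (1, 62, 100):
          print("n = %3d years: +/- %.2f K" % (n, per_step * math.sqrt(n)))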

      I don’t know where Matthew Marler got the idea that he was doing something with correlations.

      • BTW, just to point out the failure of dimensions, that is
        0.42 (dimensionless) * 33K * 4 Wm⁻² year⁻¹ / 33.3 Wm⁻² * sqrt(n years) = K/sqrt(year).
        Or if you prefer 4 Wm⁻², as it seems to be more often lately, then it is K*sqrt(year)
        Odd units for a temperature uncertainty.

        • For i = year 1: ±[(0.42 * 33K * 4 Wm⁻² year⁻¹)/F_0]_year_1

          = ±[(0.42 * 33K * 4 Wm⁻²)/F_0]_1

          ±[(0.42 *33K * 4 Wm⁻²)/F_0]_i

          =>±[(0.42 * 33K * 4 Wm⁻²)/F_0]_n

          The left side of eqn. 5 has an “i” index. That index is not on the right side ±uncertainty term. The “year⁻¹” in “4 Wm⁻² year⁻¹” is cancelled by the 1-year index.

          I described that above already, here

          You already agreed with the formalism, Nick, here

          Which agreement I acknowledged, here

          And now you falsely raise the whole thing again, as though it were unsettled.

          • Nick appears to be a troll, trolling the thread hoping for someone to make a misstatement he can use in the faint hope of falsifying your entire paper.

          • “You already agreed with the formalism”
            I said it could probably be made to work (although it was nuts). But you clearly haven’t made it work.

            “And now you falsely raise the whole thing again”
            No, I showed what the actual calculation was. The product of five numbers, one (time in years) being subject to a square root. And whether you prefer your Tuesday unit of 4 Wm⁻² year⁻¹ or Wednesday unit of 4 Wm⁻², the units just don’t make sense.

          • They do make sense. And they make sense to every serious thinker here, but not to you, Nick.

            They’ll never make sense to you here, for reasons of policy.

    • See what Vasquez and Whiting write about propagating empirical uncertainty through a set of calculations, Rich. That’s what it is.

      • Pat, assuming you are referring to my question in Note 6 somewhat further above (dratted WordPress), is not Vasquez and Whiting about uncertainty propagation in general? It is the specific instance of your paper which I want to understand. If I do not, then I have no way of judging if your overall conclusions are well founded. Specifically, your conclusion seems to be that GCM uncertainties at 2100 are so large as to be useless. In that context, what exactly does Figure 6(b) represent so that it affirms your conclusion?

        TIA,
        Rich.

        • Legend to Figure 6: “Panel (B) the identical SRES scenarios showing the (+/-)1sigma uncertainty bars due to the annual average (+/-)4 Wm^-2 CMIP5 TCF long-wave tropospheric thermal flux calibration error propagated in annual steps through the projections as equation 5 and equation 6.

          That’s what it means, and that’s why it affirms my conclusions.

          • Pat, thanks, so the 1-sigma uncertainty bars in Panel B do relate one-to-one to the bars in Panel A, but they are calculated differently. Now I think the bars in Panel A can be used to predict the spread which would occur if new model runs were made. But your point in Panel B is that those runs could have strayed ever so far from reality because of the physical uncertainty in the parameters of those models.

            Is that a fair summary? And sorry for being a bit slow on this 🙂

  76. Note 7: How does a cloud in 1990 affect me now?

    In 1990 my third daughter was born. 9 weeks later I took her to the surgery for a routine check-up. It was scorching hot, and it turned out that around the time of that visit the local town recorded the highest known temperature for my country (since broken a couple of times). But we made it there and back again fine. There was not a cloud in the sky, as you might expect.

    How does the lack of clouds that day affect the temperature today? If my country had been Tuvalu, then the lack of clouds would have heated the water, which could then have heated deeper waters, and some of that heat could be returning via deep ocean currents to the atmosphere today. But my country is not Tuvalu, it is England. It is well known that ground does not retain significant heat for any length of time, because of highly effective radiation from warm surfaces, so it is hard to see how the lack of cloud in 1990 could affect temperatures today. It follows that any error, or uncertainty, in modelling clouds over land cannot persist in the error between model and reality today.

    Over the seas is another matter, however. But now it is not sufficient to consider TCF (Total Cloud Forcing), but rather the distribution of CF over land and sea (and different portions of sea may differ in terms of the long term effects of the forcing). And in this case it is now necessary to take account of correlations across time and space. For example, if OCF (Ocean Cloud Forcing) had low model uncertainty and LCF (Land Cloud Forcing) had a compensating high model uncertainty, very little CF uncertainty should be inferred to propagate from one year to the next. I am not saying this situation obtains, but we can’t simply assume it doesn’t. Across time, if a GCM in one year of calibration has a positive OCF error, but an anticorrelated tendency to a negative error the following year or two (as might be predicted by El Nino/La Nina), then again those values will tend to cancel out.

    In conclusion I would say that you can meaningfully linearize the means of GCMs, but you can’t linearize the noise, in the sense of year after year sampling supposed uncorrelated uncertainties in order simply to add up the variances thereof to arrive at the paper’s high uncertainty values.

    Or as Nick Stokes, with whom I tend to disagree on climate alarm but am with him on this one, said: “the real problem is that error propagation in (1) has nothing to do with error propagation in a GCM”.

    • Forgot to mention “took her to the surgery” meant “walked half a mile with a pram to get there”, and the local town’s own record has not been beaten but other places have done so, one of them this year.

    • See-owe to Rich: I am not saying this situation obtains, but we can’t simply assume it doesn’t. Across time, if a GCM in one year of calibration has a positive OCF error, but an anticorrelated tendency to a negative error the following year or two (as might be predicted by El Nino/La Nina), then again those values will tend to cancel out.

      In conclusion I would say that you can meaningfully linearize the means of GCMs, but you can’t linearize the noise, in the sense of year after year sampling supposed uncorrelated uncertainties in order simply to add up the variances thereof to arrive at the paper’s high uncertainty values.

      What you know is that, across a variety of GCMs, all of them that have been tested to date, the relationship between CO2-induced forcing and temperature is linear, to a high degree of approximation. Whatever effects there might be of some forcings, or parameterizations, cancelling others (we can be pretty sure that they exist because of the extensive “tuning” that has occurred), the result for all the GCMs as they are is that the relationship between their forcing input and temperature output is nearly linear, to a high degree of approximation. I hope you don’t mind the repetition; at the end of thinking what else might be in the models and what the physics really is, that relationship is pretty well established.

      Equations 1 and 5, if I understand Pat Frank’s delta-delta clarification of his notation, describe the uncertainty in forecasts from his linear model due to the uncertainty in the cloud feedback parameter used in the GCMs. Why should it not therefore be an accurate approximation to the propagation of uncertainty in the GCMs? Is there anything available now for which a case can be made that it would provide a better approximation?

      Nick Stokes proposed complete reruns of the GCMs with a range of parameter values (I think he proposed that; he described complete reruns with a range of initial values). I proposed bootstrapping. Right now these are not really feasible, but perhaps a few values from within the CI for the parameter could be run — say the lower and upper 10% points and the median.

      Is there right now a procedure better than what Pat Frank has carried out?

      • See-owe to Rich: in the sense of year after year sampling supposed uncorrelated uncertainties

        And at the risk of more exasperating repetition, let me repeat that the sampling does not occur year after year. The sampling occurs at the start, with the choice of the parameter value to put into the program (a value that is undoubtedly in error but an unknown error); the uncertainty that accumulates year after year is a result of that single random draw.

        • Matthew, I was sure that the paper said that there was a new random draw each year – will have to check that later.

          You wrote “describe the uncertainty in forecasts from his linear model due to the uncertainty in the cloud feedback parameter used in the GCMs”. I’d like Pat to confirm that, as it is the nub of my question in Note 6. It would at least give me some more concrete mathematics to understand and analyze, whereas I am still groping for the linkages between the various components. (That’s it for today UK time.)

    • Rich, “ Across time, if a GCM in one year of calibration has a positive OCF error, but an anticorrelated tendency to a negative error the following year or two (as might be predicated by El Nino/La Nina), then again those values will tend to cancel out.

      Cancelling errors in a calibration experiment do not remove uncertainty in a prediction. They combine as a rss uncertainty that conditions the reliability of the prediction.

      Cancelling errors do not improve physical theory. They just hide the mistakes in physical theory.

  77. Tim (Sep24 6:18am), on uncertainty.

    Unfortunately WordPress is not letting me reply directly to your comment. You ask why uncertainties have to be formalized into random variables. The reason is that there is a mathematics of random variables which shows how to combine them. As far as I can tell, even though uncertainty may be a unique animal, the combination of uncertainties as in, for example, Pat’s Equation (3), assumes that there is a hidden random variable underneath them. And without that mathematics you ain’t got nothing, or you might have to be over-pessimistic and assume the worst case, that for a bounded uncertainty interval either endpoint is a feasible value. In that case you have to sum the interval sizes instead of using the sophistication of Equation (3).

    Note that that equation uses covariances, and those can be negative, so ignoring them and using the root sum square of the variance terms does not produce a provable lower bound.
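
    To illustrate that covariance point in the simplest possible setting (a generic two-term sum, not the paper’s Equation (3) itself): the combined variance of x = u + v is σ_u^2 + σ_v^2 + 2·cov(u,v), so a negative covariance pulls the result below the root-sum-square, while perfect positive correlation gives the straight sum of the interval sizes.

    import math

    # Generic two-variable illustration: combined standard uncertainty of x = u + v,
    # sigma_x^2 = sigma_u^2 + sigma_v^2 + 2*cov(u, v). Numbers are arbitrary.
    sigma_u, sigma_v = 1.0, 1.0

    for rho in (-0.8, 0.0, 0.8, 1.0):              # correlation between u and v
        cov = rho * sigma_u * sigma_v
        sigma_x = math.sqrt(sigma_u**2 + sigma_v**2 + 2 * cov)
        print("rho = %+.1f: sigma_x = %.3f" % (rho, sigma_x))
    # rho = 0 reproduces the root-sum-square; rho = 1 gives the straight sum (2.0).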

    Now, I understand that you think that in some cases the uncertainty is not a random variable because it is always the same even if we don’t know exactly what it is. Well, if we knew what it was we would just treat it as a bias and subtract it off. And for a single uncertainty interval if you wanted to claim that any value in that range was equally likely, no-one would stop you unless they had extra knowledge about the instrument. But for combining multiple uncertainty intervals into a single one, the statistics of random variables is the only good tool we have, and at that point we need at a minimum an estimate of the standard deviation for the uncertainty.

    Rich.

    • See-owe to Rich: And without that mathematics you ain’t got nothing, or you might have to be over pessimistic and assume the worst case that for a bounded uncertainty interval either endpoint is a feasible value. In that case you have to sum the interval sizes instead of the sophistication of Equation (3).

      I was planning at some time to ask Nick Stokes if he thought that summing the interval sizes instead of summing the variances and taking their square roots would be a preferable procedure.

      Nick Stokes: So what is the basis for adding the uᵢ in quadrature, in Eq 6, if it isn’t an εKε calculation?

      • “Why should it not therefore be an accurate approximation to the propagation of uncertainty in the GCMs?”
        “Is there right now a procedure better than what Pat Frank has carried out?”

        This is profoundly unscientific reasoning. You have to show that it is an accurate approximation. And “at least it’s something” is not a validation of Pat Frank’s procedure (which is nuts).

        I gave a demonstration in my article on DEs of how error propagation and having a common solution are totally uncoupled.

        “the uncertainty that accumulates year after year is a result of that single random draw.”
        “to ask Nick Stokes if he thought that summing the interval sizes instead of summing the variances and taking their square roots would be a preferable procedure.”
        Why should uncertainty accumulate at all?
        Suppose you made an error in, say, overestimating the solar constant. The GCM would simply solve for the temperature of a slightly warmer planet. It isn’t an error that accumulates year after year until the seas boil. Same with cloud cover.

        • Nick Stokes: It isn’t an error that accumulates year after year until the seas boil.

          Once again you are confusing error with uncertainty — the effects of the error accumulate, but you do not know what they are, so the uncertainty accumulates. But the error in this case is in the forecast, not in the sun. Overestimating the solar constant will not make the seas boil.

          Some of the posts seem to claim that Pat Frank’s uncertainty analysis of the GCMs can’t be accurate because it opens the possibility that some of the GCMs might forecast physically impossible states. You have claimed above that the error in the model treatment of solar power can’t propagate because then the seas would boil. If the error in the model might produce a physically impossible forecast, and the analysis shows that such a forecast is compatible with the uncertainty in the parameter estimate, that is not a flaw in the analysis. It’s a limitation of knowledge and modeling that ought to be recognized.

          This is profoundly unscientific reasoning. You have to show that it is an accurate approximation.

          He has shown that the approximations that can be tested so far are accurate, most importantly the linearity in the CO2-Temp input-output relationship of the GCMs. Given that, his uncertainty propagation is accurate. We are stuck with forecasts that have not been shown to be accurate (but are defensible), and an uncertainty estimate that is defensible and can’t be improved upon — and is higher than most people imagined possible. The “scientific” approach is to continue to improve on the approximations.

        • “the effects of the error accumulate, but you do not know what they are, so the uncertainty accumulates”
          They don’t accumulate, as I said. An error in solar constant would simply be reflected by the GCM producing a consistently warmer or cooler world. The uncertainty/error mumbo jumbo makes uncertainty able to do more than actual, realised error.

          “If the error in the model might produce a physically impossible forecast, and the analysis shows that such a forecast is compatible with the uncertainty in the parameter estimate, that is not a flaw in the analysis. “
          It is not a flaw in the analysis of the simple model. But it refutes the application to the GCM, because the GCM, unlike the simple model, has mechanisms that ensure that it will not reach that result. So you are not uncertain about whether a GCM might produce that result – it can’t.

          “an uncertainty estimate that is defensible”
          It is not defensible. As you have shown, Eq 5 makes no sense for Σ reasons, and PF is not open to fixing it. You haven’t commented on the absurd units of the results. I noted elsewhere the clear mistake in S6.2, where he forms an average by summing over n – the number of GCM-sat interactions, but divides by 20, the number of years. etc etc.

          But the big one, of course, is simply not actually analysing the effect of the differential equations that make up the GCM.

      • Well if you do, you’ll get a result which is about sqrt(n) times as big as before, so an uncertainty of 270*O(1)K instead of 30K after 81 years! Not a good idea.

    • “As far as I can tell, even though uncertainty may be a unique animal, the combination of them as in, for example, Pat’s Equation (3), assumes that there is a hidden random variable underneath them.”

      How do you get that? Have you read the examples using a ruler assumed to be 12″ long with an uncertainty of +/- 1″? That uncertainty interval grows with each iteration of using the ruler to measure the width of a room. The first iteration has an uncertainty range from 11″ to 13″. The second iteration will then have an uncertainty interval from 10″ to 14″ or +/- 2″. The third iteration will have an uncertainty interval of 9″ to 15″ or +/- 3″. The uncertainty grows linearly with each iteration. There is no randomness to this at all.
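
      In code, that worst-case bookkeeping is just a running sum of the half-widths (a sketch of the same 12″ +/- 1″ example described above):

      # Worst-case accumulation of the 12" +/- 1" ruler's uncertainty over
      # repeated end-to-end placements: the half-width adds linearly.
      nominal, half_width = 12.0, 1.0     # inches
      for k in range(1, 4):
          print('%d placement(s): %.0f" +/- %.0f"' % (k, k * nominal, k * half_width))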

      “you might have to be over pessimistic and assume the worst case that for a bounded uncertainty interval either endpoint is a feasible value.”

      That is *exactly* what an uncertainty interval means!

      “Note that that equation uses covariances, and those can be negative, so ignoring them and using the root sum square of the variance terms does not produce a provable lower bound.”

      A covariance is a relationship between two *random* variables. The uncertainty is not a random variable. The uncertainty interval doesn’t assume random values at each step.

      “Well, if we knew what it was we would just treat it as a bias and subtract it off.”

      You are still confusing error and uncertainty. You can’t subtract off uncertainty like you can error or bias. It doesn’t work that way. To reduce uncertainty you have to work on the areas of uncertainty to make them more certain.

      “But for combining multiple uncertainty intervals into a single one, the statistics of random variables is the only good tool we have, and at that point we need at a minimum an estimate of the standard deviation for the uncertainty.”

      Again and again and again – uncertainty is not a random variable, it has no standard deviation. It is not the same thing as error. Go back to the ruler example. No amount of statistical analysis will change the fact that the ruler has an uncertainty of +/- 1″. The only way to lessen the uncertainty is to work on the ruler so that its results become less uncertain!

      • “The uncertainty grows linearly with each iteration. There is no randomness to this at all.”
        But Pat’s uncertainty grows as sqrt(n). He adds in quadrature. No-one has any coherent explanation for that.

        • “But Pat’s uncertainty grows as sqrt(n). He adds in quadrature. No-one has any coherent explanation for that.”

          Have you actually read Pat’s document for meaning? What does Eq 6 say?

          • Eq 6 says he does it. But no-one can say why.

            “uncertainty is not a random variable, it has no standard deviation”
            From Eq 3:
            “the uncertainty variance propagated into x is:”
            From Eq 4:
            “a serial propagation of physical error through n steps yields the uncertainty variance in the realization of the final state,”

            Standard deviation is the square root of variance.

            What do you think is being added in Eq 6?

          • ““uncertainty is not a random variable, it has no standard deviation””

            This is a misstatement and is misleading. My apologies. Uncertainty defines an interval not a probability function. It does not describe the probabilities for the various amounts of error that can occur within that uncertainty interval so it can’t give the variance of the error function. It can only tell you the interval in which the true answer lies. The uncertainty interval does have a relationship to the systemic error associated with a process. If it were possible to eliminate *all* error then of course the uncertainty interval would be zero. That is certainly not the case with the climate models.

            What do I think is being added? Exactly what he said:

            “For example, in a single calculation of x = f(u,v,…), where u, v, etc., are measured magnitudes with uncertainties in accuracy of ±(σu,σv,…), then the uncertainty variance propagated into x is,”

            “When states x0,., xn represent a time-evolving system, then the model expectation value XN is a prediction of a future state and σ2XN is a measure of the confidence to be invested in that prediction, i.e., its reliability.”

            I simply don’t get what is so hard about this.

          • “Exactly what he said”
            Well, he said what he did. But how is it to be justified? It relates to the question of whether u is random. If uᵢ are iid random variables, then adding variances is right. But iid is a big qualification. If there is covariance, that has to come in. But “uncertainty” is now a mush of maybe random, maybe not. Yet, while there are big conditions that need to be satisfied for random variables, people here are happy to add in quadrature with no such investigation at all.

          • “If uᵢ are iid random variables, then adding variances is right. But iid is a big qualification. If there is covariance, that has to come in. But “uncertainty” is now a mush of maybe random, maybe not.”

            Only in *your* mind, Nick. “u” is uncertainty, it is an interval that is related to a random variable but is not itself a random variable. The uncertainty associated with a ruler that is 12″ long +/- 1″ doesn’t change as you make iterative measurements! That +/- 1″ has a cumulative effect on iterative measurements. The impact on the climate models is no different. That might be an inconvenient truth for you to comprehend but it is the truth nonetheless.

          • “at least the reference is here to demonstrate it”

            Your reference, headed “A Summary of Error Propagation”, carries these caveats at the outset:

            “Here are some rules which you will occasionally need; all of them
            assume that the quantities a, b, etc. have errors which are uncorrelated and random”

            Just what I said here; they need to be iid random variables for this claim of adding in quadrature. Now folks here keep saying that they aren’t random. In which case, it is even less likely that they are uncorrelated. Not a word has been spoken to justify that proviso. Matthew keeps saying that it is just one initial error which compounds. If so, it seems 100% correlated.

          • “No, they don’t.”
            The reference you just linked says they do. I just quoted it.

            Now you link back to your long rigmarole, with no quotes to support your contention. But just looking at Vasquez, which you most commonly quote, it’s true that they carelessly give a quadrature expression without explicit proviso in Eq 5. But in the very next paragraph, they say:
            “For the case of having correlation among the input variables,” [non-iid]
            and list an expression with correlations – not quadrature. You actually quote that in your Eqs 3 and 4. But you never explain why you can drop the correlations.

          • I’m just getting back to this today. As a Ph.D. statistician it is blindingly obvious to me that adding uncertainties in quadrature, as Nick Stokes well put it, arises from an assumption of underlying uncorrelated random variables – it’s hard to see how it could be anything else. Nevertheless, I am grateful to Nick for reading the Vasquez reference to confirm that.

          • “an assumption of underlying uncorrelated random variables”

            Uncertainty is not a random variable that changes from iteration to iteration. Once the uncertainty is baked in then it stays baked in. A ruler that is 12″ +/- 1″ doesn’t all of a sudden change to 12″ +/- 0.5″ on the third iteration of its use.

          • Tim [Sep26 2:56pm],

            Your ruler example can be viewed as a repeated use of a single random variate, which is an actual value taken by a random variable, in your case uniformly distributed in (-1,1) inches. So you are right that if we use the ruler say 20 times we will get an error of 20x and, because we don’t know x, even though we believe it exists as a number, we have to allow an uncertainty interval of (-20,20).

            But in Pat’s paper the uncertainties are added “in quadrature”, as Nick writes it. This comes about not by repeated use of a single random variate, but by multiple samples, assumed uncorrelated, from the uncertainty distribution. In your case this would mean using 20 different rulers, each with uncertainty, or error as many people would actually call it, in the range of (-1,1). Assuming uniformity again, the variance is 1/3 and the final uncertainty would be taken to be +/-sqrt(20/3) = +/-2.58 for a 1-sigma interval or 5.16 for a 2-sigma interval. With the latter, the chance that the measurement was incorrect by more than 5.16 inches would be about 5%.

            I hope this helps.
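
            A minimal Monte Carlo sketch of the two cases described above, with made-up numbers (one mis-marked ruler whose unknown error repeats on every use, versus 20 different rulers with independent errors drawn from uniform(-1, 1) inches); this is only an illustration of the arithmetic, not a claim about which case applies to the climate models:

            import numpy as np

            rng = np.random.default_rng(0)
            n_uses, n_trials = 20, 100_000

            # Case 1: one ruler with a single fixed (but unknown) error, reused 20 times.
            single_err = rng.uniform(-1, 1, size=n_trials)            # one draw per trial
            total_single = n_uses * single_err                        # the same error repeats

            # Case 2: 20 different rulers, each with its own independent error.
            multi_err = rng.uniform(-1, 1, size=(n_trials, n_uses))   # fresh draw per use
            total_multi = multi_err.sum(axis=1)

            print(np.std(total_single))   # ~11.5 = 20*sqrt(1/3): grows like n
            print(np.std(total_multi))    # ~2.58 = sqrt(20/3):   grows like sqrt(n)

            Which of the two cases fits a GCM calibration error is, of course, exactly what is in dispute in this thread.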

    • Rich, “the combination of them as in, for example, Pat’s Equation (3), assumes that there is a hidden random variable underneath them.

      It doesn’t, actually. It just shows that scientists will use a method that yields a useful measure of reliability, even if the closed form axioms are not known to be sustained.

      See Vasquez and Whiting, “When several sources of systematic errors are identified, beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:

      beta ~ sqrt[sum over(theta_S_i)^2], where i defines the sources of bias errors and theta_S is the bias range within the error source i.

      Rich, “Note that that equation uses covariances, and those can be negative, so ignoring them and using the root sum square of the variance terms does not produce a provable lower bound.

      The calibration error statistic is a fixed value that does not covary.

      Rich, “Well, if we knew what it was we would just treat it as a bias and subtract it off.”

      And if the error is unpredictably variable? And if there is no error information about predicted future states?

      • 1. Rich, “the combination of them as in, for example, Pat’s Equation (3), assumes that there is a hidden random variable underneath them.”

        It doesn’t, actually. It just shows that scientists will use a method that yields a useful measure of reliability, even if the closed form axioms are not known to be sustained.
        See Vasquez and Whiting, “When several sources of systematic errors are identified, beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:
        beta ~ sqrt[sum over(theta_S_i)^2], where i defines the sources of bias errors and theta_S is the bias range within the error source i.”

        Answer: that is how you get junk science, when scientists use a method when its axioms and conditions are not known to be sustained. I’m not saying that your science is junk, but as of now I’m not saying that it’s not either.

        2. Rich, “Note that that equation uses covariances, and those can be negative, so ignoring them and using the root sum square of the variance terms does not produce a provable lower bound.”
        The calibration error statistic is a fixed value that does not covary.

        Answer: if it is a fixed value then not only does it not have covariance amongst samplings of itself, it also does not have variance. Therefore it is an unknown fixed value, and its uncertainty intervals must be added like Tim Gorman’s single ruler example, not added “in quadrature”.

        3. “Well, if we knew what it was we would just treat it as a bias and subtract it off. ”
        And if the error is unpredictably variable? And if there is no error information about predicted future states?

        Answer: but you just said that it is a fixed value, so how can it also be unpredictably variable? Which day of the week is it now, and in which Wonderland are we residing?

  78. Pat Frank’s (2019) abstract draws attention to substantial errors within CMIP5 climate models in simulating the global cloud fraction. It then claims:

    The resulting long-wave cloud forcing (LWCF) error introduces an annual average ±4 Wm–2 uncertainty into the simulated tropospheric thermal energy flux.

    What the cited source for this uncertainty, Lauer and Hamilton (2013) asserts, however, is stunningly different, both physically and statistically:

    The CF is defined as the difference between ToA all-sky and clear-sky outgoing radiation in the solar spectral range (SCF) and in the thermal spectral range (LCF)…

    The rmse of the multimodel mean for SCF is 8 W m−2 in both CMIP3 and CMIP5….For CMIP5, the correlation of the multimodel mean LCF is 0.93 (rmse = 4 W m−2) and ranges between 0.70 and 0.92 (rmse = 4–11 W m−2) for the individual models.

    Clearly, L&H’s empirical error estimates (based upon a 20-yr comparison of model simulations and satellite measurements) pertain strictly to ToA power flux densities. They say little about the troposphere. Moreover, the empirical error is always specified as a fixed rms value. This standard error specification is necessarily in the same units as the signal itself and applies indefinitely; it cannot be projected per annum as Frank would have it. The fact that annual average data were used in the comparison implies only that there’s no involvement of seasonal and diurnal cycles in the given error estimates. There’s simply no pertinent annual rate of change here, such as with the “CO2-forcing” signal.

    Nor does the specification of standard error for annually averaged data imply anything about the one-step autocorrelation of that error and its predictability at any step. Contrary to assumption here, the “prediction error” of the models, in any rigorous sense, is not involved at all. Later this week I’ll demonstrate how Frank’s own analysis of one-step autocorrelation of error totally undermines his projection of how it propagates.

    • All true
      “What the cited source for this uncertainty, Lauer and Hamilton (2013) asserts, however, is stunningly different, both physically and statistically”
      It isn’t even the uncertainty of a global (spatial) average. It seems clear from the description that he computes the spatial correlation, ie between GCM/obs values at grid points. That is also the focus of Taylor’s paper, which Lauer seems to be following. That correlation is then converted to the 4 W/m2. It is the uncertainty over space, not time.

      • What is –

        ±(cloud-cover-unit) year⁻¹ × W m⁻²/(cloud-cover-unit)

        Please show where the math in Supplemental Section 6.3 is incorrect.

        The distance a car travels is a spatial sum. Yet you can calculate the miles/year value for the car thus turning a spatial calculation into one with a time relationship. Why can’t you do the same for cloud cover?

        • “Please show where the math in Supplemental Section 6.3 is incorrect.”
          S6.2 is clearly incorrect. He sums a set of discrepancies e_{i,g} in g over n, “where “n” is the number of simulation-observation pairs evaluated at grid-point “g” across the 20-year calibration period.” But then, to get the average, he divides by 20 years. There is no basis for that. And it’s where his year⁻¹ unit nonsense comes from.

          • “the annual mean simulation error at grid-point g, calculated over 20 years of observation and simulation, is”

            “where “n” is the number of simulation-observation pairs evaluated at grid-point “g” across the 20-year calibration period. Individual grid-point error ei,g is of dimension”

            You’ve got a 20 year value for cloud-cover so you divide by 20 years to get an annual value. “20 year mean” means something.

            I simply don’t see what is so difficult about that.

          • “I simply don’t see what is so difficult about that.”
            It’s just nonsense. To get an average, you sum and then divide by the number of things summed. By n, not by a fixed 20 years. If you had twice as many observations, that wouldn’t necessarily change the average. But with that formula, since you divide by a fixed 20, it would double it.
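
            A toy numeric illustration of the point being argued here, with made-up discrepancy values (0.5 each; nothing taken from the paper or from Lauer and Hamilton): dividing by the number of items summed leaves the average unchanged when the sample doubles, while dividing a growing sum by a fixed 20 does not.

            errors_20 = [0.5] * 20   # hypothetical per-observation discrepancies
            errors_40 = [0.5] * 40   # twice as many observations, same magnitudes

            print(sum(errors_20) / len(errors_20), sum(errors_40) / len(errors_40))   # 0.5 0.5
            print(sum(errors_20) / 20, sum(errors_40) / 20)                            # 0.5 1.0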

          • Nick, “If you had twice as many observations,…

            Observations of what magnitude error, Nick? Increasing the number of observations may reduce the average.

            Once again you make a simplistic mistake as soon as the arena is science.

          • Nick, “But then, to get the average, he divides by 20 years. There is no basis for that.

            From Lauer and Hamilton, “An analysis of the spread in the 20-yr-mean LWP among the ensemble members of individual models. …

            Figure 1 shows the 20-yr annual mean liquid water path averaged over the years 1986–2005 from 24 CMIP5 models …

            FIG. 1. The 20-yr average LWP (1986–2005) from the CMIP5 historical model runs and the multimodel mean

            A measure of the performance of the CMIP model ensemble in reproducing observed mean cloud properties is obtained by calculating the differences in modeled (xmod) and observed (xobs) 20-yr means (my bolding)”

            Twenty-year model and observational means are described throughout Lauer and Hamilton.

            But you already knew that, didn’t you Nick.

            Your “no basis” must have been an innocent mistake, mustn’t it.

          • “Twenty-year model and observational means”
            This is just idiotic. To get a mean or average, you divide a sum of things by the number of things. Not by 20 years because someone mentioned that they compiled the things (differences) over a twenty year period.

            Here is the section of the SI with Eq 6.2. The point is of some importance because it is where the year⁻¹ nonsense enters.

            “Increasing the number of observations may reduce the average.”
            The mean of observations is used as a population mean estimate. So there is no basis for that. In any case, this is yet another case of you and your defenders trying to excuse something that is mathematically wrong by saying that, well, you can’t tell, it just might come out all right.

          • ‘”The mean of observations is used as a population mean estimate. So there is no basis for that. In any case, this is yet another case of you and your defenders trying to excuse something that is mathematically wrong by saying that, well, you can’t tell, it just might come out all right.”

            Any additions of observations that shift the population makeup can affect the population mean, either positively or negatively. Only if the additional observations equal the mean is there no change in the mean.

      • Nick, “It isn’t even the uncertainty of a global (spatial) average.

        From Lauer and Hamilton, “In both CMIP3 and CMIP5, the large intermodel spread and biases in CA and LWP contrast strikingly with a much smaller spread and better agreement of global average SCF and LCF with observations. The SCF and LCF directly affect the global mean radiative balance of the earth, so it is reasonable to suppose that modelers have focused on ‘‘tuning’’ their results to reproduce aspects of SCF and LCF as the global energy balance is of crucial importance for long climate integrations.

        Further, “FIG. 7. Biases in simulated 20-yr-mean LWP from (left) the (top to bottom) four individual coupled CMIP5 models and (middle) their AMIP counterparts, with the smallest global average rmse in LWP. (right) The biases in annual mean SST in the coupled runs (my bold)”

        Wrong again, Nick. But not deliberately so. Not at all.

        • The RMSE is the RMS of discrepancies over space, not time. Here (from here) is what the author, Lauer, had to say:

          “I have contacted Axel Lauer of the cited paper (Lauer and Hamilton, 2013) to make sure I am correct on this point and he told me via email that “The RMSE we calculated for the multi-model mean longwave cloud forcing in our 2013 paper is the RMSE of the average *geographical* pattern. This has nothing to do with an error estimate for the global mean value on a particular time scale.”.”

          • “The RMSE is the RMS of discrepancies over space, not time. Here (from here) is what the author, Lauer, had to say:”

            I believe I already asked this. A car’s odometer has an uncertainty associated with it and the odometer measures a quantity in space and not time. Yet we can certainly take the increase of the odometer over the period of a year and call it miles/year driven, a measure in time. The uncertainty in space then becomes an uncertainty in time. The uncertainty doesn’t just disappear or go away because you are now using a measure in time. The uncertainty over the measure of a mile by the odometer adds for every mile measured, just like the uncertainty of a measurement made by the ruler that is 12″ +/- 1″ adds with every iteration of it being used to measure something longer than 12″. When that uncertainty in the total of the odometer over a year is evaluated at the end of the year the uncertainty doesn’t just disappear.

            Why is this concept so damnably difficult for you to grasp?

            Do you disagree? Do you *really* think that a measure of miles/year has no uncertainty?

          • “Why is this concept so damnably difficult for you to grasp?”

            It’s nothing like that. Lauer sets it out clearly and it’s even reflected in S6. There is a grid of points on Earth. At each, at various times, they have coincident values from GCM and observation. They get a correlation coefficient, and from that deduce the 4 W/m2. They haven’t formed a spatial average. The correlation doesn’t relate to any particular period of time, as Lauer said.

            Why is it so hard to grasp that if you are going to base a whole theory of failure of GCMs on that one number, 4 W.m2, you need to know what it actually is?

          • “Why is it so hard to grasp that if you are going to base a whole theory of failure of GCMs on that one number, 4 W.m2, you need to know what it actually is?”

            In other words you still don’t believe that the number of miles driven in a car (a spatial scalar) over the period of a year can become miles/year. Got it.

          • Nick, “They get a correlation coefficient, and from that deduce the 4 W/m2.

            From Lauer and Hamilton, page 3831: “A measure of the performance of the CMIP model ensemble in reproducing observed mean cloud properties is obtained by calculating the differences in modeled (x_mod) and observed (x_obs) 20-yr means.

            From page 3833: “The overall comparisons of the annual mean cloud properties with observations are summarized for individual models and for the ensemble means by the Taylor diagrams for CA, LWP, SCF, and LCF shown in Fig. 3. These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means. (my bold)”

            Page 3842, “Our analysis of the root-mean-square error of simulated LWP, CA, SCF, and LCF supports our findings on little to no changes in the skill of reproducing the observed LWP and CA.

            It’s very clear that the rmse in LWCF is derived from the geographical distribution of (simulation minus observation) errors over a 20 year calibration time.

            The correlations describe the linear coherence between simulated cloud properties and observed cloud properties.

            Correlations are *not* used to deduce the 4 W/m2.

            Nick either did not read Lauer and Hamilton, or has seriously deficient reading skills. Or?

          • “Correlations are *not* used to deduce the 4 W/m2.”
            They are. The very brief reference to that figure you have based your analysis on says:
            “For CMIP5, the correlation of the multimodel mean LCF is 0.93 (rmse = 4 Wm⁻²)”
            and they explain how they get that (p 3833). They form a polar plot – a Taylor diagram, which plots sd against correlation (theta). Then they deduce rmse as proportional to a linear distance on this plot.

            “It’s very clear that the rmse in LWCF is derived from the geographical distribution of (simulation minus observation) errors over a 20 year calibration time.”
            Yes, it is. But that is the rmse associated with variation at each grid-point, as measured at points with coincident obs/GCM values during a 20 year period. It is not the variation of a global spatial average, as you are treating it. It does not represent a variability you can associate with a forcing. In fact, it will be much attenuated once you do form a global average. And it has no per year status, as Lauer said.
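
            For readers unfamiliar with Taylor diagrams, the geometry being described can be sketched in a few lines of Python (the standard deviations below are hypothetical, chosen only to show the law-of-cosines relation; they are not Lauer and Hamilton's actual field statistics):

            import math

            def centered_rmse(sd_model, sd_obs, r):
                # Taylor-diagram relation (law of cosines): the centered RMS difference
                # follows from the two standard deviations and the pattern correlation r.
                return math.sqrt(sd_model**2 + sd_obs**2 - 2.0 * sd_model * sd_obs * r)

            # Hypothetical spatial standard deviations (W/m2) with a correlation of 0.93:
            print(centered_rmse(10.5, 11.0, 0.93))   # ~4.0 W/m2 for these illustrative inputs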

    • “Later this week I’ll demonstrate how Frank’s own analysis of one-step autocorrelation of error totally undermines his projection of how it propagates.”

      And again we see the confusion of error with uncertainty.

    • 1sky1, “it cannot be projected per annum as Frank would have it.

      L&H page 3833, “The overall comparisons of the annual mean cloud properties with observations are summarized for individual models and for the ensemble means by the Taylor diagrams for CA, LWP, SCF, and LCF shown in Fig. 3. (my bold)”

      Are you going to argue, as Nick Stokes does 1sky1, that “annual mean” does not mean annual mean? Is abject incoherence a practice of art for you, as it is for Nick?

      1sky1, “Nor does the specification of standard error for annually averaged data imply anything about the one-step autocorrelation of that error and its predictability at any step.“(my bold).

      So, it’s all annually averaged after all, including annual mean uncertainty. Uncertainty, not error, 1sky1. Arguing error is a fatal mistake. Uncertainty due to calibration error stemming from model theory-error necessarily appears in every simulation step of a futures projection.

      1sky1, “Clearly, L&H’s empirical error estimates … pertains strictly to ToA power flux densities. It says little about the troposphere.

      You don’t stand a chance of making that case, 1sky1. And you’d not even try unless you’re an adherent of Stokesism.

      From Hartmann, et al, below: “The largest contribution to net cloud forcing are provided by low clouds, especially in the tropical stratus cloud regions and the summer hemisphere(Fig. 21). Low clouds are abundant, and act to reduce the radiation balance by reflecting solar radiation. High thick clouds also provide significant reductions to the radiation balance, since they reflect more solar radiation than they trap longwave emission. High- and middle-level clouds with relatively low optical depth do not enter very strongly into the regression equations for net radiation. High thin clouds make a small positive contribution to the net radiation in tropical latitudes — amounting to about 5 W m^-2 in zonal average. (Fig 22).

      Quote from p. 1299 of Hartmann, et al., (1992) The Effect of Cloud Type on Earth’s Energy Balance: Global Analysis J Climate. 5(11), 1281-1304.

      That quote is just one of what could be many from Hartmann, where they discuss the impact of clouds on the thermal energy flux of the troposphere.

      See also Figure 7.1 of the IPCC 5AR for a graphical rendering of the impact of longwave cloud forcing on global tropospheric air temperature.

      From Figure 7.1 legend: “Overview of forcing and feedback pathways involving greenhouse gases, aerosols and clouds. … Feedback loops, which are ultimately rooted in changes ensuing from changes in the surface temperature, are represented by curving arrows (blue denotes cloud feedbacks;…)(my bold)

      Surrounding text includes, “Figure 7.1 illustrates key aspects of how clouds and aerosols contribute to climate change, and provides an overview of important terminological distinctions. … Rapid adjustments (sometimes called rapid responses) arise when forcing agents, by altering flows of energy internal to the system, affect cloud cover or other components of the climate system and thereby alter the global budget indirectly. … As shown in Figure 7.1, adjustments can occur through geographic temperature variations, lapse rate changes, cloud changes and vegetation effects. (my bold)”

      See also Figure 7.3, wherein, “the brown arrow symbolizes the importance of couplings between the surface and the cloud layer for rapid adjustments. (my bold)”

      AR5, under 7.2.1.2, “The net global mean CRE (cloud radiative effect) of approximately –20 W m^–2 implies a net cooling effect of the clouds on the current climate. Owing to the large magnitudes of the SWCRE (short wave cloud radiative effect) and LWCRE (long wave cloud radiative effect), clouds have the potential to cause significant climate feedback (Section 7.2.5).

      5AR under 7.2.5: “Until very recently cloud feedbacks have been diagnosed in models by differencing cloud radiative effects in doubled CO2 and control climates, normalized by the change in global mean surface temperature. … Moreover, it is now recognized that some of the cloud changes are induced directly by the atmospheric radiative effects of CO2 independently of surface warming, and are therefore rapid adjustments rather than feedbacks (Section 7.2.5.6).

      Not a chance, 1sky1.

      • So, it’s all annually averaged after all, including annual mean uncertainty. Uncertainty, not error, 1sky1. Arguing error is a fatal mistake.

        When someone clings obsessively to the bizarre notion that straightforward algebraic averaging of any given time-series over yearly intervals introduces an “annual mean uncertainty,” thereby changing the dimensions of the data, then there’s no chance of any rational discussion.

        • 1sky1, I agree, but the only way to demonstrate error, that is, scientific and logical error, in order to further the rational discussion, is to use mathematics to dispute what was asserted. If you could do that it would be great. I might get around to it, but I am frying other fish right now.

          Rich.

          • With only minutes to spare in my daily schedule for blog activity, I must leave the demonstration of the mistaken treatment of modeling uncertainty to others. Fortunately, there is another blog that has taken up this murky issue competently, offering Monte Carlo simulations of both the systematic forcing error and the putative error arising from Frank’s mistaken assumption of a random-walk type of yearly error propagation. See: https://andthentheresphysics.files.wordpress.com/2019/09/uncertainchangeforcing-1.png

          • There is no such “Frank’s mistaken assumption of a random-walk type of yearly error propagation.”

            Uncertainty is not about error specification.

            It’s about reliability in a predicted result where no knowledge of error is available.

            Uncertainty analysis includes no assumption about the structure of error.

            You folks are consistently wrong.

          • Pat,

            They keep saying your analysis doesn’t work properly for determining the error in the output of the GCM’s. I simply cannot believe that it is so hard to understand the difference between uncertainty and error. If the uncertainty is greater than the change you are trying to detect then you simply don’t know if the change is real. It’s the same as calculating out to the hundredths when your measuring device only resolves to the tenths! Mathematicians and computer programmers seem to have no concept of significant digits or uncertainty.

          • Uncertainty is not about error specification. It’s about reliability in a predicted result where no knowledge of error is available.

            If “no knowledge of error is available,” from where does the 4 W/m^2, whose square you accumulate annually, come? From divine imagination, just like your notion that a process whose variance increases linearly with time is not a Wiener process–a 1-D version of a random walk?

        • I changed no dimensions, 1sky1.

          It’s not “any given time-series.” It’s an error series.

          The LWCF rmse is the mean annual standard deviation of error from a 20-year calibration-of-simulation experiment. It generally characterizes the resolution of CMIP5 simulations of TCF.

          If you’re going to obsessively assert what is not present, then your conclusion is true.

          • No matter what the nature of the series, the mathematical operation of averaging data in barely non-overlapping intervals cannot change the dimensions of the series. The rmse of ToA LW emissions reported by Lauer and Hamilton is the standard deviation of the all-model aggregate average of the individual 20-yr model-error series. Because there is autocorrelation in those series, that deviation is definitely NOT the “mean annual standard deviation” of the individual years, as you have it.

        • Here is something from that other discussion:

          “As Gavin Schmidt pointed out when this idea first surfaced in 2008, it’s like assuming that if a clock is off by about a minute today, that tomorrow it will be off by two minutes, and in a year off by 365 minutes. In reality, the errors over a long time are completely unconnected with the offset today.”

          If the clock was correct yesterday and off by a minute today then it is *NOT* an offset bias that caused the difference. It is an in-built error in the time-keeping mechanism. If that error mechanism is unknown then there is an uncertainty in the output of the clock at any point in time. That uncertainty will grow as the actual error displayed by the clock grows, since most clocks don’t randomly gain and lose time. If the actual growth in the error over time can be determined then the growth in the uncertainty can be determined as well.

  79. “again we see the confusion of error with uncertainty”
    Just dumb parroting. From the title:
    “Propagation of Error and…”
    From the abstract:
    “Linear projections are subject to linear propagation of error”
    From the key-words
    “propagated error,”
    From the intro
    “Propagating physical errors through a model is standard”
    etc etc

    • “Just dumb parroting. From the title:”

      I’ll admit some of the wording is perhaps confusing. But the math isn’t.

      You still can’t get over the fact that you can’t eliminate uncertainty by assuming it is a random variable. Get over it! The example of a 12″ +/- 1″ ruler being used to measure the width of a room should have been enough of an example for you to figure this out. The uncertainty grows with every iteration of its use. You can’t “average” it away! It’s the same for the climate models, you can’t “average” the uncertainty away. It grows with each iteration. The only way to lower the overall uncertainty is to make the models more accurate, i.e. lower that initial uncertainty interval.

    • As usual, you display no knowledge of science, Nick.

      Propagation of error through a calculation yields uncertainty in a result.

      It doesn’t matter that I posted a large selection of published literature showing that to be the case.

      You’ll evidently continue in denial anyway. Science denial. What a concept, hey?

      • Pat, Nick is not denying science and mathematics, merely your unquestioning use of aspects of it. Here are 3 ways to accumulate uncertainty.

        1. Add up the intervals. That is correct in the case of Tim Gorman’s single ruler used many times.

        2. Add up the inferred variances and take the square root (“adding in quadrature”). That is correct in the case of using many rulers, under the assumption at least that no pair of them were cut from a 24+/-0.01″ piece of wood. If that assumption is violated then there is severe anti-correlation.

        3. Add up the inferred variances and covariances weighted by the appropriate mathematics which you have cited. This is the correct way, but reduces to No. 2 if it turns out all the covariances are statistically indistinguishable from 0. As far as I can tell you haven’t looked at a relevant covariance matrix so you can’t tell.
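
        A minimal sketch of those three options, each treated as a special case of the general covariance form (the per-measurement uncertainty of 1 unit and the correlation values are assumptions for illustration, not estimates from any model or ruler):

        import numpy as np

        def total_uncertainty(sigmas, rho):
            # sd of a sum of terms with equal pairwise correlation rho (option 3);
            # rho = 0 reduces to quadrature (option 2), rho = 1 to the straight interval sum (option 1).
            s = np.asarray(sigmas, dtype=float)
            cov = rho * np.outer(s, s)
            np.fill_diagonal(cov, s ** 2)
            return float(np.sqrt(cov.sum()))      # variance of the sum is 1' * Cov * 1

        sigmas = [1.0] * 20
        print(total_uncertainty(sigmas, rho=1.0))   # 20.0  -> option 1, the single-ruler case
        print(total_uncertainty(sigmas, rho=0.0))   # ~4.47 -> option 2, sqrt(20), quadrature
        print(total_uncertainty(sigmas, rho=0.5))   # ~14.5 -> option 3, covariances included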

        • A single, mis-scaled ruler used consistently over time introduces only a static scaling error, which may be unknown, but is easy to spot when accurate rulers are available. Even in the absence of such rulers, if the error-statistics of the production run of faulty rulers are known, then measurements made by a large-enough set of such rulers provide useful results via their aggregate average. That’s the principle that GCMs, which provide no bona fide predictions of actual climate, rely upon in their simulations of “climate scenarios.”

          Contrary to Frank’s claim of “no covariance,” straightforward scaling error is always perfectly coherent at all frequencies with the underlying signal. That signal is known to be very significantly autocorrelated, which totally invalidates his presumption of additivity of stochastically independent variances.

    • There’s no confusion of terms or in wording, Tim.

      “Suppose you measure some quantities a; b; c; … with uncertainties ∂a; ∂b; ∂c; … . Now you want to calculate some other quantity Q which depends on a and b and so forth. What is the uncertainty in Q?” (Harvard physics, 86 kb pdf)

      That reference won’t stop Nick from endlessly repeating his untruth, but at least the reference is here to demonstrate it.

      • Pat,

        “There’s no confusion of terms or in wording, Tim.”

        Not to me anyway. To someone dead set on finding *anything* to stick to the wall it’s any shelter in the storm.

  80. Nick Stokes: So what is the basis for adding the uᵢ in quadrature, in Eq 6, if it isn’t an εKε calculation?

    That would be an interesting question if anything else could be agreed. Outstanding disagreements seem to revolve around the following:

    1. In the GCMs, the relationship between CO2 forcing input and temperature output is linear. Nick Stokes asked how that was possible if the GCMs are not accurate models of climate [hope I did not lose anything important in the paraphrase.] That the relationship is linear to a high degree of approximation is supported by analysis; and in turn supports Pat Frank’s uncertainty propagation.

    2. Pat Frank’s analytic procedure does not create a random walk with independent sampling of error in each year. The only random variation is in the estimate of the parameter. There is additional uncertainty in the estimate of the standard deviation of that parameter estimate (hence the endpoint of the CI), but no evidence that the standard error of the parameter is small.

    3. Representation of the uncertainty in the parameter estimate by a probability model instead of a fixed-width interval (for which the propagation of uncertainty produces a larger spread in the uncertainty of the final projected value).

    4. In professional as well as common usage, the word “error” can refer to each of: (1) a particular error value; (2) the uncertainty in the value of an error that is likely present.

    As far as I can tell, the critics of Pat Frank are unwilling to admit both (a) that the CO2-Temp relation of the GCMs is highly linear and (b) that the relationship can be useful in subsequent calculations.

    • “That would be an interesting question if anything else could be agreed”
      How does it depend on that? The statement is there, in the paper. There must be some justification for it. Well, maybe.

      “if the GCMs are not accurate models of climate”
      No, I said: how is it possible, if the GCMs are so mired in uncertainty, that any coherent modelling could emulate them?

      ” uncertainty in the parameter estimate by a probability model”
      So what is the difference between that and saying it is random?

      “are unwilling to admit both that the CO2-Temp relation of the GCMs is highly linear”
      Not at all. As I said here, a generally linear dependence of ΔT on ΔF is to be expected, and was the basis of much 1-D modelling, up to Manabe and Wetherald and beyond, who did it far better than here. And it is expected to be strongly present in the small part of GCM output related to global surface temperature.

      ” that relationship can be useful in subsequent calculations.”
      It can be useful in analysing Eq 1. It has nothing to do with error propagation in a GCM.

      An example of why. Suppose you have a GCM, and also a simple model of the Earth at which TOA outward is fixed at 1361 W/m2. Since that currently is about right, the simple model may well give right answers. But how will the two models respond to uncertainty about insolation? What if it increases? The GCM will come into balance; the simple model will just keep accumulating heat.
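
      A toy sketch of that contrast, under assumed parameter values (the heat capacity and feedback numbers below are illustrative only, and neither model is a GCM or the paper's emulation equation): a zero-dimensional balance with a feedback term relaxes to a new equilibrium after a 1 W/m2 step in absorbed flux, while a model whose outgoing flux is pinned at a constant simply accumulates heat.

      # Toy comparison: feedback model vs fixed-outgoing-flux model after a +1 W/m2 step.
      C = 8.0e8       # heat capacity, J m-2 K-1 (assumed, roughly an ocean mixed layer)
      lam = 3.2       # feedback parameter, W m-2 K-1 (assumed)
      dF = 1.0        # step change in absorbed flux, W m-2
      dt = 86400.0    # time step of one day, in seconds

      T_feedback, stored_heat = 0.0, 0.0
      for step in range(365 * 200):                          # integrate for 200 years
          T_feedback += dt * (dF - lam * T_feedback) / C     # relaxes toward dF/lam
          stored_heat += dt * dF                             # fixed outgoing flux: heat piles up

      print(T_feedback)    # ~0.31 K = dF/lam: the feedback model finds a new balance
      print(stored_heat)   # ~6.3e9 J/m2 and still growing: no balance is ever reached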

      • Nick wrote, “No, I said how is it possible if the GCMs are so mired in uncertainty, that any cohereht modelling could emulate them.

        Showing ignorance of the difference between error and uncertainty.

        Nick, “a generally linear dependence of ΔT on ΔF is to be expected,

        It appears the trauma of Jerry Browning has had its effect on you.

        Nick, “The GCM will come into balance; the simple model will just keep accumulating heat.

        Eqn. 1 isn’t “a simple model of the Earth.” It’s a model demonstrating that GCM air temperature projections are simple extrapolations.

        Also, for the gazillionth time, uncertainty is not error.

        A point that also combines to invalidate, “It has nothing to do with error propagation in a GCM

      • Hi Nick,

        A while back on this thread, I inquired if you had worked with DEs in the area of finance, to which you kindly provided references to CSIRO’s development of plug-in models for FENICS (ref. Nick Stokes September 20, 2019 at 12:06 am). I meant to respond at the time, if only to say that you are clearly a very smart person, whose math and modeling skills are vastly beyond mine. Subsequently, I have thought further about the precision vs. accuracy conundrum, and would like to run the following analogy between GCMs and derivative pricing models (DPMs) by you for comment:

        Consider a universe of major money-center banks, each of which maintains proprietary DPMs for pricing long-dated interest rate options out to 10 years, or so. While independent, these DPMs are all based on similar references to the academic literature and are calibrated in real-time to reflect current market interest rates (e.g., cash Libors, ED futures, swap rates, Treasury bond yields, etc.), as well as the volatilities of observable instruments. In other words, they are internally consistent with forward rates and volatilities. Presuming that one could obtain concurrent pricing on a strip of forward interest rate options (out to 10 years) from each of the banks, what observations could be made about the DPMs’ 1) precision and 2) accuracy?

        Based on my own experience (admittedly limited and dated), I’d posit that the precision of the DPMs would be high, and that price variations among the various banks would be minimal, maybe on the order of a few “bips”. In contrast, the accuracy of the DPMs would be low – I can’t think of any rational person or financial institution that would hold a sizable unhedged short position in these options for any amount of time, neither overnight nor for a week, let alone anything on the order of years. This is not to say that we lack expertise in how to model interest rate derivatives in a “risk neutral” world, we just have no way of knowing how economic / financial conditions will evolve over time in the real world based on any information we can obtain from current financial market data.

        To extend the analogy a bit further, the issue I have with GCMs is not that bright people like yourself have contributed to their current state of development, it’s that so-called “policy makers” are using the GCMs’ results to insist that we cede our personal liberties to them concurrent with massively shorting our current long position in reliable energy sources. While this might work out well for a fortunate few, I think it would be disastrous for most of us.

        Thank you.

        • Frank,
          “what observations could be made about the DPMs’ 1) precision and 2) accuracy?”
          Personally, I think much nonsense is talked about precision and accuracy. Accuracy is supposed to be discrepancy between measurement and truth, but folks who push this also tend to insist that every measure has uncertainty. IOW we can never know “truth” and so can never determine accuracy. All we can do is compare measures with what we believe to be better measures.

          This is particularly true of DPMs. They assign a value to a probability proposition, but you can never test that by any perfect measure, or even a good one. The only test would be market value (when?) but that only tests relative to others’ estimate (probably using similar software). In fact, a lot of the early use of our software was in providing a second opinion (because it was somewhat different).

          I don’t think there is a much useful analogy between DPMs and GCMs for error propagation. DPMs are basically heat equation, and fairly stable, although people try to push them into unstable situations. GCMs have wave-like solutions which are very important for error propagation. They run for long basically steady periods, whereas DPM propositions typically have defined periods.

          As to GCMs, personal liberties etc, I think of course that is nonsense. We are in a situation where we are making big changes to the atmosphere. That will have effects; that has been known since Arrhenius in 1896. GCMs are our best way of quantifying the future effects, but they didn’t originate the concern about them. Deciding not to try to quantify them at all is not an adequate answer.

          • Nick, “That will have effects; that has been known since Arrhenius in 1896.

            Arrhenius had no adequate physical theory of climate to make such a determination and neither do you, Nick.

            Nor anyone else.

            Meanwhile, the climate shows no unusual behavior.

      • Nick Stokes: “if the GCMs are not accurate models of climate”
        No, I said: how is it possible, if the GCMs are so mired in uncertainty, that any coherent modelling could emulate them?

        ” uncertainty in the parameter estimate by a probability model”
        So what is the difference between that and saying it is random?

        First, picking up on your comment that the A matrix is known, the question is “about what is there uncertainty?” We are certain what parameter value is used in the modeling, but we are uncertain about the relationship between the used number (based on empirical evidence) and the “true” number. It’s the same question as with the early estimates of the gravitational constant in Newton’s inverse-square law: the first estimate was known as soon as it was calculated; the uncertainty revolved around its closeness to the true value.

        Second, I touched earlier on the use of the same mathematical theory of probability to represent relative frequencies of occurrence (“aleatory”) and confidence/uncertainty in the truth of propositions (“epistemic”) and the relationship between them. It’s “obvious” that the random elements in the data induce uncertainty in the parameter estimates computed from them. Quantifying the relationship between the random variation and the uncertainty is done in two principal ways: through Bayes’ Theorem, to calculate “credible intervals” for the parameter from prior belief/uncertainty (before data collection) and the random variation in the data; and through standard errors of the parameter estimate, yielding confidence intervals and confidence distributions. They have been much studied. A good recent book, somewhat technical, is “Computer Age Statistical Inference” by Bradley Efron and Trevor Hastie. I also recommend “Confidence, Likelihood, Probability: Statistical Inference with Confidence Distributions” by Tore Schweder and Nils Lid Hjort. A formal limit theorem states that with a sufficiently large sample the credibility distribution and confidence distribution are practically indistinguishable. Not everyone agrees that the “epistemic” probabilities have been shown to be as useful as the “aleatory” probabilities.
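
        A small sketch of the aleatory/epistemic point in that last paragraph, under the simplest textbook assumptions (normal data with known sigma and a flat prior on the mean; the data here are simulated, not drawn from any climate quantity): the Bayesian credible interval and the classical confidence interval come out numerically identical, which is the kind of large-sample agreement referred to above.

        import numpy as np

        rng = np.random.default_rng(1)
        sigma, n = 2.0, 50
        data = rng.normal(loc=10.0, scale=sigma, size=n)    # hypothetical measurements

        xbar = data.mean()
        half = 1.96 * sigma / np.sqrt(n)

        # Classical 95% confidence interval for the mean (aleatory framing).
        ci = (xbar - half, xbar + half)

        # Bayesian 95% credible interval with a flat prior on the mean (epistemic framing):
        # the posterior is Normal(xbar, sigma^2 / n), so the interval is the same numbers.
        credible = (xbar - half, xbar + half)

        print(ci)
        print(credible)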

        • “Not everyone agrees that the “epistemic” probabilities have been shown to be as useful as the “aleatory” probabilities.”
          Some fancy philosophy here. But you are not dealing with the very basic questions of why the error should be compounded in quadrature. Or why it should be compounded at all. Or the basic mathematical howlers, like eq 5.1, which makes no sense either as a change of whole temperature, as you noted, or as difference, which was your alternative (repudiated) scenario. Or of the junior high school level error in S6.2. Or of the units, which are treated with no consistency, but can’t work however you do it.

          And, of course, the very basic question of how analysing Eq 1 helps with propagation of error in a GCM.

          • Nick Stokes: But you are not dealing with the very basic questions of why the error should be compounded in quadrature. Or why it should be compounded at all.

            That is not a “basic” question.

  81. The muddling of the meaning of the empirical estimate of the rms error in modelling ToA planetary LW emissions (see my comment yesterday ) is compounded by “propagating” this error as if it accumulated in annual steps. The assumption is made that it’s a systematic specification error, analogous to a mis-scaled ruler used for measurement, with a different ruler used every year in a whole chain of mis-measurements.

    But rms error specifies only the variability—not the bias—of model output. And no GCM operates by randomly changing its computational algorithms—no matter how unrealistic—in annual, or any other, steps. Nor do GCMs, run in climate simulation mode, make any calculations based upon assimilating data about the actual state of climate. They run the same algorithms with the same set parameters as in the final “calibration” phase. Their only tie to real time lies in the historic specification of “forcing.” The error of such model simulations is entirely unrelated to that of bona fide time-series predictors, which indeed grows with time to an asymptotic value set by total signal auto-covariance. Contrary to assumption here, it tends to remain stable, albeit in a complicated way that requires deeper insight into model specifics.

    Frank examines the structure of CMIP5 multi-model-mean total cloud fraction error over a 25-year period, concluding that the “highly autocorrelated lag-1 error (R = 0.97) implies that systematic cloud effects remain in the error residual. This in turn indicates that the CSIRO GCM systematically misrepresented the terrestrial cloud cover.”

    While modelling errors reaching beyond 10% are certainly disturbing, what strong lag-1 autocorrelation in a short error-record doesn’t resolve, however, is whether this is due to a truly deterministic system-specification bias or simply the result of failing to model random climate variations with longer periods, such as the ~60yr-oscillation evident in many indices–or perhaps even energetic, low-frequency “red noise.” In any event, it points strongly to the conclusion that intimately-related errors in modeled LW emissions and surface temperatures are NOT stochastically independent year to year. Thus whatever modeling error there may be in the long run, it CANNOT be legitimately propagated as the square root of the cumulative sum of the variance of average yearly errors.

    The truly dismal feature of the presentation here is how many fail to realize that stochastic independence at each step is mathematically necessary for Frank’s error formulation to hold.
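
    A Monte Carlo sketch of that independence point, with assumed numbers (a per-year spread of 4 and a lag-1 autocorrelation of 0.9; neither value is taken from the paper): whether the yearly terms are independent changes how the spread of an accumulated sum behaves, which is why the iid qualification keeps coming up in this thread.

    import numpy as np

    rng = np.random.default_rng(2)
    n_years, n_trials, sd, phi = 20, 50_000, 4.0, 0.9

    # Independent yearly terms: quadrature applies, spread of the sum ~ sd*sqrt(n_years).
    iid_sums = rng.normal(0.0, sd, size=(n_trials, n_years)).sum(axis=1)

    # AR(1) yearly terms with lag-1 autocorrelation phi and the same per-year spread.
    e = np.empty((n_trials, n_years))
    e[:, 0] = rng.normal(0.0, sd, size=n_trials)            # start from the stationary spread
    innov_sd = sd * np.sqrt(1.0 - phi ** 2)                 # keeps each year's sd equal to 4
    for t in range(1, n_years):
        e[:, t] = phi * e[:, t - 1] + rng.normal(0.0, innov_sd, size=n_trials)
    ar1_sums = e.sum(axis=1)

    print(np.std(iid_sums))   # ~17.9 = 4*sqrt(20), the quadrature value
    print(np.std(ar1_sums))   # ~60: far from the quadrature value when the terms are correlated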

    • An excellent comment that gets close to the heart of the problem.

      Dr Franks is making the mistake of believing that there is a one-suit-fits-all recipe for uncertainty propagation from a calibration error, irrespective of the physical system under study. One simple test to run is to assume that the calibration error in LW cloud forcing is made up of a systemic bias error and an annually imposed random error independently drawn each year from a pdf. It is easily shown that neither type of error translates into the temperature series as an integrating error.

      The lag-1 autocorrelation in “LW cloud forcing” is actually expected and predictable from the physics/mathematics of the problem and arises from the relationship between cloud feedback and temperature. It does not imply that LW cloud forcing itself varies as an integrated series; even less does it imply that the total forcing propagates uncertainty as an integrated series. I will try to explain this further in another post.

      • Unfortunately, the “one-suit-fits-all recipe” for solving very difficult real-world problems is a common academic boondoggle. Instead of admitting any scientific inability, attention is often directed to some aspect of the problem for which the solution is well known.
        That is egregiously the case in “climate science,” which places excessive emphasis on idealized treatment of purely radiative processes in the atmosphere, while soft-pedaling the dominant mechanism of moist convection in transferring heat from the aqueous planetary surface. The ensuing cloud formation not only influences LW fluxes, but strongly modulates the surface insolation–the real exogenous forcing of the system.

        To keep matters in realistic perspective, please note that the lag-1 autocorrelation referred to here pertains not to any “LW cloud forcing,” per se, but to the cloud fraction, usually reported in oktas.

        • 1sky1, “Unfortunately, the “one-suit-fits-all recipe” for solving very difficult real-world problems is a common academic boondoggle.

          Another inadvertent descent into ironic humor.

          Supposing heavily parameterized engineering models are predictive well beyond their calibration bounds.

      • kribaez, “An excellent comment that gets close to the heart of the problem.

        Yeah. The heart of the problem is that none of you know anything about uncertainty analysis.

        Dr Franks ….

        Frank, not Franks. But in any case, please call me Pat

        … is making the mistake of believing that there is a one-suit-fits-all recipe for uncertainty propagation from a calibration error, irrespective of the physical system under study.

        That’s truly funny, kribaez. Linear sums always follow linear propagation of error, no matter the physical system under study.

        GCM air temperature projections are linear sums of linearly extrapolated fractional GHG forcing. That’s a qed.

        One simple test to run is to assume that the calibration error in LW cloud forcing is made up of a systemic bias error and an annually imposed random error independently drawn each year from a pdf. It is easily shown that neither type of error translates into the temperature series as an integrating error.

        The usual, refractory, and very tedious mistake of thinking uncertainty is error.

        You wrote, “The lag-1 autocorrelation … arises from the relationship between cloud feedback and temperature.

        The lag-1 autocorrelation is in the error due to the difference between observed and simulated CF. It doesn’t matter how it arises. It reveals the structure of the simulation error.

        You wrote, “It does not imply that LW cloud forcing itself varies as an integrated series; …

        No, it implies that GCMS do not simulate cloud fraction correctly and that the error includes a linearly deterministic residual.

        … even less does it imply that the total forcing propagates uncertainty as an integrated series.

        Irrelevant. The lag-1 auto-correlation is never taken to imply that.

        I will try to explain this further in another post.

        Let’s see: that will be a further explanation of an argument starting from a mistake and extending into an irrelevance.

    • 1sky1, “The muddling of the meaning of the empirical estimate of the rms error in modelling ToA planetary LW emissions…

      There was no such muddling. Evidence posted here.

      1sky1, “The assumption is made that it’s a systematic specification error, analogous to a mis-scaled ruler used for measurement, with a different ruler used every year in a whole chain of mis-measurements.

      Your description is not correct.

      The analogy might be the *same* mis-scaled ruler used to make a whole series of measurements. That ruler is from a production run of rulers that are known to be mis-scaled. However, the error in any given ruler is unknown.

      All you have is a calibration average of measurement uncertainty for that production run of rulers. You then use one of them. You have no idea of its specific measurement error. You only have the average calibration error statistic for the production run.

      Then you take a series of measurements using that ruler of unknown specific measurement error, and sum them all up into a final total.

      The uncertainty is present in every single measurement. If the uncertainty is (+/-)1 mm in measurement 1, then it is (+/-)1.41 mm in the sum of measurement 1 + measurement 2. And so it goes.

      The uncertainty in the summed total is the rss of the uncertainty in every individual measurement made. That is error propagation. It is done using the calibration error statistic of the production run.

      A much more involved problem, and a much worse problem than you allow, 1sky1.

      You wrote, “But rms error specifies only the variability—not the bias–of model output.

      No, it does not.

      The rms calibration error specifies the difference between simulation and observation: accuracy. The variability of model output is simulation 1 minus simulation 2: precision.

      You have made the fatal mistake that is standard among climate modelers. You folks exhibit no understanding of physical error analysis.

      You wrote, “And no GCM operates by randomly changing its computational algorithms…

      Not an assumption in my work, nor an assumption in determining calibration error.

      You’re going further and further afield into mistaken territory, 1sky1.

      You wrote, “Nor do GCMs, run in climate simulation mode, make any calculations based upon assimilating data about the actual state of climate.

      They’re merely based upon parameters derived from assimilated data about the actual state of the climate over the calibration period.

      And those parameters, which produce wrong simulations in the calibration period, are supposed to produce accurate simulations in a futures projection. Great thinking.

      You wrote, “They run the same algorithms with the same set parameters as in the final “calibration” phase.

      Perfect. And those parameters are found to produce simulation errors when compared with observations over the calibration period. The rms of those errors across models and across the calibration period establish a lower limit of model resolution.

      You wrote, “The error of such model simulations is entirely unrelated to those of bona fide time-series predictors, which indeed grows with time to an asymptotic value set by total signal auto-covariance

      But growth of uncertainty is not about growth of error, 1sky1. It’s about the reliability of the expectation value.

      It’s about not knowing where the simulated climate state is with respect to the physically correct climate state in the climate phase-space.

      You wrote, “what strong lag-1 autocorrelation in a short error-record doesn’t resolve, however, is whether this is due to a truly deterministic system-specification bias or simply the result of failing to model random climate variations with longer periods, such as the ~60yr-oscillation evident in many indices–or perhaps even energetic, low-frequency “red noise.”

      Except that the error is a 25 year average per model. Random effects will have diminished by 5-fold.

      Further, we know that the average TCF error over all 27 models and 20 years = 540 model years, is (+/-)12.1 %, indicating the persistence of error through a large aggregate.

      One gets the same fractional CF simulation error whether averaging the per-model CF simulation error, or taking the difference between observed cloud fraction and the average simulated CF of the entire 12 model ensemble discussed in Jiang, et al., (2012) Evaluation of cloud and water vapor simulations in CMIP5 climate models using NASA “A-Train” satellite observations JGR 117, D14105; doi: 10.1029/2011jd017237.

      That is, the error fraction does not diminish when all the model simulations are averaged. Did the simulation errors have a random component, the fractional CF error would be reduced in the simulation average.

      Nor, if the error were random, would one expect to find large inter-model correlations of error in 25-year error means.
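
      A minimal sketch of that averaging argument, with assumed numbers (a shared 10% systematic component and an independent 5% random component per model; these are illustrative values, not figures from the paper or from Jiang et al.): averaging an ensemble shrinks the random part by roughly sqrt(N) but leaves the shared systematic part untouched.

      import numpy as np

      rng = np.random.default_rng(3)
      n_models, n_trials = 27, 20_000

      shared_bias = 10.0                                              # percent, common to all models (assumed)
      random_part = rng.normal(0.0, 5.0, size=(n_trials, n_models))   # percent, independent per model (assumed)

      per_model_error = shared_bias + random_part
      ensemble_mean_error = per_model_error.mean(axis=1)

      print(np.mean(np.abs(per_model_error)))      # ~10%: a typical single-model error
      print(np.mean(np.abs(ensemble_mean_error)))  # ~10%: the shared part does not average away
      print(np.std(ensemble_mean_error))           # ~1% = 5/sqrt(27): only the random part shrinks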

      You wrote, “the conclusion that intimately-related errors in modeled LW emissions and surface temperatures are NOT stochastically independent year to year.

      This conclusion follows your hand-waving dismissal of inter-model correlation of TCF error.

      Apart from that bit of negligence, the uncertainty analysis is not about error. You can’t know how error behaves in a futures projection. Your argument here is utterly irrelevant to an uncertainty analysis.

      You wrote, “Thus whatever modeling error there may be in the long run, it CANNOT be legitimately propagated as the square root of the cumulative sum of the variance of average yearly errors.

      Whatever modeling error there may be in the long run [in a futures projection] will always be unknown. There’s no point in talking about how unknown quantities behave.

      The calibration rmse is not an "average yearly error" as you have it. It is the average uncertainty to be expected across every single year of a simulation. Add up the years, and the uncertainties combine as the rss (root-sum-square).

      It’s standard analysis that invariably escapes climate modelers.

      You wrote, “The truly dismal feature of the presentation here is how many fail to realize that stochastic independence at each step is mathematically necessary for Frank’s error formulation to hold.

      Rather, the truly dismal failure here is how many who suppose they are scientists know nothing about physical calibration error analysis.

      My analysis is about uncertainty, not error. It presumes nothing about the structure of simulation error, including nothing about its stochastic independence.

      Uncertainty is about the reliability of a predicted result for which no specific error magnitude is available.

      Over and yet over again you miss that critically central point.

    • All you have is a calibration average of measurement uncertainty for that production run of rulers. You then use one of them. You have no idea of its specific measurement error. You only have the average calibration error statistic for the production run.

      Then you take a series of measurements using that ruler of unknown specific measurement error, and sum them all up into a final total.

      The uncertainty is present in every single measurement. If the uncertainty is (+/-)1 mm in measurement 1, then it is (+/-)1.41 mm in the sum of measurement 1+ measurement 2. And so it goes.

      The uncertainty in the summed total is the rss of the uncertainty in every individual measurement made. That is error propagation. It is done using the calibration error statistic of the production run; a short numerical sketch follows at the end of this comment.

      A much more involved problem, and a much worse problem than you allow, 1sky1.
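
      As a minimal numerical sketch of the ruler example above (assuming, purely for illustration, that the (+/-)1 mm calibration statistic is treated as a 1-sigma uncertainty per measurement), the root-sum-square combination looks like this:

```python
import math

# Calibration uncertainty per measurement, taken from the production-run statistic (mm).
u_single = 1.0

def rss_uncertainty(n, u=u_single):
    """Uncertainty in the sum of n measurements whose individual (+/-)u uncertainties
    combine in quadrature: sqrt(u^2 + u^2 + ...), n terms."""
    return math.sqrt(n) * u

for n in (1, 2, 10, 100):
    print(f"sum of {n:3d} measurements: +/- {rss_uncertainty(n):.2f} mm")
```

      The n = 2 case reproduces the (+/-)1.41 mm figure quoted above; the uncertainty in the total keeps growing with the number of measurements even though no individual error is ever known.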

      No tired repetition of basic misconceptions can provide any fitting explanation of actual uncertainty in GCM modelling of climate. The key misconception here is that the 4 W/m^2 rms error in the all-sky LW ToA emissions estimated for an aggregate average of model runs over 20 years of annually averaged data constitutes a static "calibration error," one that compounds every year, simply because the data has been averaged yearly. That's as ludicrous as claiming that a thermometer with a simple scaling factor of 0.98 that reports an average temperature of 93F for 3 minutes on a hot summer day compounds its uncertainty every 3 minutes.

      The melange of similar misreadings and misdirections promoted here does not merit even a minute’s distraction from football viewing and Oktoberfest.

      • sky:

        “That’s as ludicrous as claiming that a thermometer with a simple scaling factor of 0.98 that reports an average temperature of 93F for 3 minutes a hot summer day compounds its uncertainty every 3 minutes.”

        We are actually being asked to believe that the thermometer can show the temperature as 93.00F. For it to happen that the temperature in the summer over a 3-minute interval with changing wind, humidity, and sun incidence is constant to a hundredth of a degree is proof itself that error is compounding somewhere in the measurement unit. It might be in the thermal inertia of the measurement housing and other infrastructure, or in the thermal inertia of the measuring device (e.g., a thermistor) itself.

        Now, if you want to argue that our measurement infrastructure is only accurate to the nearest degree F, then, heck, I'll be right in there with you. But that assumption alone invalidates the outputs of the climate models in trying to forecast temperature differences in tenths or hundredths of a degree!

      • Totally missing the point about specious specification of the compounding interval of presumed scaling factor uncertainty in time-series modeling is no way to carry on a logical discussion. The accuracy of in situ measurements is not the issue here.

  82. Dr Franks,
    You wrote:-
    “The paper presented a GCM emulation equation expressing this linear relationship, along with extensive demonstrations of its unvarying success.

    In the paper, GCMs are treated as a black box. GHG forcing goes in, air temperature projections come out. These observables are the points at issue. What happens inside the black box is irrelevant.

    In the emulation equation of the paper, GHG forcing goes in and successfully emulated GCM air temperature projections come out. Just as they do in GCMs. In every case, GCM and emulation, air temperature is a linear extrapolation of GHG forcing.”

    No, what happens in the black box is not irrelevant.

    I was asked in a previous thread if I accepted your emulation model, and responded that I would accept it if you really had found a source of integrated error in forcing, but that I do not believe that you have. The most disturbing thing about your paper is that you seem to believe that there is a recipe for dealing with a calibration error which applies irrespective of the physical system being examined. My previous challenges to you on this subject have evidently not moved you from your position, so I am going to have to dissect your emulation model in a lot more detail in a follow-up post. First, though, I would like to offer you a simple analogy to demonstrate the importance of the physical model to uncertainty propagation.

    A building company wants to build 42 identical houses. One of the tasks is to measure and mark the locations of 60 rafters along a wall-plate which abuts a vertical wall. The specification is that the centre points of each rafter should be 41cms apart. The architect (who is not very bright) sets out instructions that the builders should carefully cut a batten measuring 41 cms long. They should then place an offcut from one of the rafters against the vertical wall to mark the edge of the first rafter. After that, they should lay the measuring batten against the first mark and then mark off the edge of the second rafter, and then repeat for the third, and so on.
    After the wall plates are marked off, and the first five rafters have been installed, a surveyor tests the location of the 3rd, 4th and 5th rafters for all 42 houses, and calls an emergency meeting with the site foreman. According to the surveyor’s calculations, the rafters from all 42 houses show uniformly distributed errors of (+/-)5mm. He assumes a Uniform distribution, with mean zero and variance 8.33. He also assumes that the architect’s instructions have been followed exactly, which means that he is dealing with an integrating series. Consequently, he calculates that the (additional) variance arising from the addition of a further 55 rafters will be 55 * 8.33, yielding a horrifyingly large sd of over 22mm – an expression of uncertainty in location of the final rafter. The foreman (who is a lot brighter than the architect) tells him not to worry. The error will not be more than (+/-) 5mm in the 60th rafter.

    The site foreman has been putting in rafters for many years, so he understands why it is truly stupid to mark off intervals in a way that integrates error, and so he ignored the architect's instructions. What his teams actually did was to lay a 30 m rule along each wall plate, and mark off the points at 41 cm, 82 cm, 123 cm, etc. The errors found by the surveyor comprised a bias error of up to (+/-)4 mm (because the end wall, the starting point for the measurements, is never a perfect plane) plus a marking error of (+/-)1 mm. By marking off the cumulative distances, the foreman had eliminated the integrating error.

    The point of this parable is that you cannot dissociate uncertainty propagation from the physical problem involved. The surveyor’s calculations were reasonable for his assumptions, but he was trying to solve the wrong physical problem.

    This is what you are doing with your calculation procedure. I will try in my next post to explain with a more relevant physical model why what you are doing is conceptually invalid.
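
    A minimal Monte Carlo sketch of the parable, under stated assumptions (a U(-5, 5) mm marking error per step for the architect's scheme, and a U(-4, 4) mm wall bias plus a U(-1, 1) mm marking error for the foreman's scheme, with all 60 marks treated as accumulating steps for simplicity), shows how differently the two marking schemes behave:

```python
import numpy as np

rng = np.random.default_rng(0)
n_houses, n_rafters = 100_000, 60

# Architect's scheme: each interval is marked from the previous mark, so an independent
# marking error (assumed U(-5, 5) mm) accumulates at every step.
step_err = rng.uniform(-5.0, 5.0, size=(n_houses, n_rafters))
err_integrating = step_err.cumsum(axis=1)[:, -1]          # error at the 60th rafter

# Foreman's scheme: every mark is laid off from the end wall, so only a common wall bias
# (assumed U(-4, 4) mm) plus a single marking error (U(-1, 1) mm) appears at each mark.
wall_bias = rng.uniform(-4.0, 4.0, size=n_houses)
err_cumulative = wall_bias + rng.uniform(-1.0, 1.0, size=n_houses)

print(f"sd at rafter 60, integrating scheme: {err_integrating.std():5.1f} mm")   # ~22 mm
print(f"sd at rafter 60, foreman's scheme  : {err_cumulative.std():5.1f} mm")    # ~2.4 mm
```

    The integrating scheme gives an sd near 22 mm at the final rafter, while the cumulative-marking scheme stays within a few mm, which is the point of the parable.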

    • I’m not going to go through your numbers, kribaez, but your rafter story makes the wrong analogy.

      In your story, the error is known and correctable. In a futures projection, the first is not known and the second is not possible.

      A better analogy would be if your contractor had a set of different runs of rafters cut in different lumber yards. Each run is cut to a (+/-)5 mm specification, but each run of rafters has some specific error of unknown magnitude.

      The mean of errors is not known to be zero, either within runs or between them.

      Your contractor gets rafters of the different runs all mixed together, not knowing which rafter is from which run and not knowing the specific errors.

      He calculates an uncertainty in combined final rafter length if he proceeds under those conditions. How does he do it?

      • Pat,
        “How does he do it?”
        Well, yes, he will have to calculate the uncertainty in total length as an integrating series. A classic unit root problem in the underlying statistical model.
        But rafters are normally laid spaced out side-by-side at fixed intervals. I was specifically trying to distinguish between an integrating series problem in future (spatial) prediction and a problem where all marked points were based on already accumulated spatial error. Two different problems on the same physical system, but with quite different outcomes in terms of uncertainty calculation for the location of the final rafter.
        (It is also a useful lesson for all DIY enthusiasts if they are trying to put screwholes in walls at equal intervals!)

  83. Dr Frank,
    “In the emulation equation of the paper, GHG forcing goes in and successfully emulated GCM air temperature projections come out. Just as they do in GCMs. In every case, GCM and emulation, air temperature is a linear extrapolation of GHG forcing.”
    I would like here to discuss some of the inadequacies of your model for what you are trying to do, and to try to demonstrate why there is no basis for treating a calibration error in LW cloud forcing as an integrated time series error in Forcing.

    There is no doubt that the majority of GCMs can be faithfully emulated at the aggregate level as Linear Time Invariant systems.
    They adhere well to the conservation equation:-
    Net flux = Forcing – Restorative Flux
    The simplest LTI that can be fitted to GCM results is the single-body constant linear feedback equation. For a fixed step forcing F, in Watts/m2, this can be written as: –
    Net flux = C·dT/dt = F − λT    (1)
    Where:
    Net flux is the difference between incoming and outgoing flux (positive downward by convention)
    λ is the total feedback (always > 0), in Watts/m2/K
    T is the change in temperature from initial
    C = heat capacity of the system in Watt-years/m2/K
    and t = time in years
    Since this model is linear in T, then it can be solved for an arbitrary forcing series by means of convolution or superposition. I will use the latter.
    The solution of the above equation for a fixed step forcing is given by
    T = (F/λ)·(1 − exp(−t/τ)), where τ = C/λ    (2)
    From the discretized superposition equation, for time increments of Δt, we can obtain the solution in recursive form for the nth timestep as:
    T_n = [F(t_n)/λ]·(1 − exp(−Δt/τ)) + T_(n−1)·exp(−Δt/τ)    (3a)
    Net flux at time t_n = F(t_n) − λT_n    (3b)
    Energy gain of the system = C·T    (3c)
    where T_n is the temperature gain from t = 0 to the nth timestep, and F(t_n) is the cumulative forcing to the nth timestep.
    I do not commend this LTI model as the BEST emulator of GCMs. I do commend it as a much better emulator than your model, and I will try to highlight the three main reasons.
    (i) Your emulation model is not actually a single emulation model for each GCM. Your best fitted parameter values are dependent on the frequency content of the input forcing series. This is why your parameter values change (for the same GCM) when you go from scenario to scenario. This problem becomes more evident when you move away from the near monotonic increases in forcing which form the subject of your comparisons. The LTI model does not have this problem.
    (ii) Throughout your writings, you have failed to distinguish between an error in flux, an error in net flux (balance) and an error in forcing. Indeed, your emulation model encourages you to treat all flux errors as errors in forcing, since you do not have the degrees of freedom to deal with any other type of error. You have commented, correctly, that an error in flux still means that the climate energy state is incorrect. However, that translates into a propagating error in the response to a forcing, particularly feedback, not into an integrating series error in the forcing itself. The LTI model allows one to calculate net flux as an entity distinct from forcing, and hence allows some intelligent discrimination between these different types of error.
    (iii) The LTI model has the property that (a) solutions are additive and (b) the solution for a linear increase in forcing asymptotes to a straight line increase in temperature, properties that are seen in the majority of GCMs. Starting with the LTI model, it is therefore simple to show why it appears that T varies linearly with F(t) if F(t) is a near linear function of time. You cannot however get from your model to the LTI model. In other words, it offers a more general solution than does your emulator, and better reflects the aggregate calculation of the GCMs.
    So we can now run some simple tests on the LTI model.
    Firstly, the 500 years of spin-up.
    How does a (+/-)4 W/m2 error in cloud flux propagate during the spin-up? The answer is that, in the worst case, it reflects ab initio an uncompensated error in the flux balance, which acts like an initial forcing on the system. The temperature of the system rises or falls (for a positive or negative flux error) by exactly (error/λ) K until the net flux is reduced to zero. The systemic error or bias in cloud flux will remain in the system, but introduces no uncertainty into the long-term net flux, which will always go to zero. The temperature change is bounded. In practice, because all forcing runs will subtract the starting temperature to calculate the incremental temperature change, the effect of this temperature change is limited. It does however change the feedback response of the system, which clearly cannot be perfect if the internal components of the energy balance are not correct. We can examine the propagating characteristics of such error in a moment.
    What happens if a stochastic uncertainty is added annually to the bias error in cloud flux? The answer is: not a lot, if the errors are all independently sampled and added to the bias. They do not represent an integrating error in propagation. I set the cloud flux to a bias error of 4 W/m2 and each year added an uncompensated error drawn from a normal distribution with mean 0 and sd of 0.3 W/m2. It can be seen from the recursive formula (Eq. 3a) that the new temperature is dependent on the new forcing increment (which includes this error) as well as the temperature from the previous time-step (which includes all previous errors). The long-term effect is (only) a fluctuation of net flux about zero with a sd less than the sd of the cloud flux error, and a compensatory fluctuation of temperature about the system's stable steady-state value. If, on the other hand, I add each randomly drawn annual sample error to the previous value, then the problem explodes very rapidly, as one would expect with an integrating time series. There is however no justification for this.
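    A minimal sketch of this experiment, using the recursion of Eq. (3a) with illustrative values C = 14 W-yr/m^2/K and λ = 1 W/m^2/K (neither is stated above, so both are assumptions), a constant 4 W/m^2 cloud-flux bias, and N(0, 0.3) annual errors:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, C = 1.0, 14.0          # feedback (W/m^2/K) and heat capacity (W-yr/m^2/K): illustrative only
tau, dt = C / lam, 1.0      # time constant (years) and annual timestep
decay = np.exp(-dt / tau)

years = 500
bias = 4.0                                    # constant cloud-flux bias error, W/m^2
eps = rng.normal(0.0, 0.3, size=years)        # annual flux errors, N(0, 0.3) W/m^2

def run(forcing):
    """Recursion of Eq. (3a): T_n = [F(t_n)/lam]*(1 - exp(-dt/tau)) + T_(n-1)*exp(-dt/tau)."""
    T, temps = 0.0, []
    for F in forcing:
        T = (F / lam) * (1.0 - decay) + T * decay
        temps.append(T)
    return np.array(temps)

T_indep = run(bias + eps)            # errors sampled independently each year
T_integ = run(bias + eps.cumsum())   # each year's error added to the previous value

print(f"independent errors: final T = {T_indep[-1]:.2f} K, sd of last 100 yr = {T_indep[-100:].std():.2f} K")
print(f"integrated errors : final T = {T_integ[-1]:.2f} K, sd of last 100 yr = {T_integ[-100:].std():.2f} K")
```

    With independently sampled errors the temperature settles near bias/λ and merely fluctuates; with the errors accumulated year on year it wanders like an integrating series, as described above.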
    What is worth noting is that this combination of systemic bias error and random annual error would be large visible contributors to the calibration error which you identify, but as indicated above, they have very little effect on the uncertainty of temperature projection post-spin-up, apart from feedback error.
    I will deal with the question of why the LW cloud flux forcing shows a high degree of autocorrelation and the related question of feedback error in a separate post, since this is already too long for comfort.

    • kribaez, emulation eqn. 1 only shows that GCM air temperature projections are linear extrapolations of GHG forcing. It emulates output.

      I’m not sure what you mean by GCM “aggregate level” and so will ignore that.

      You wrote, “Your emulation model is not actually a single emulation model for each GCM.

      It is an emulator for each air temperature projection of each GCM.

      You wrote, “This problem becomes more evident when you move away from the near monotonic increases in forcing which form the subject of your comparisons.

      Figures 3, 7, and 8, as well as SI Figure S4-7, show that eqn. 1 does well in reproducing projected GCM air temperatures that reflect the non-linear forcing from volcanic aerosols.

      You wrote, “Throughout your writings, you have failed to distinguish between an error in flux, an error in net flux (balance) and an error in forcing.

      I am not concerned with any of those. I am concerned with the uncertainty, not error.

      My paper is concerned with uncertainty in simulated tropospheric thermal energy flux, following from the average annual GCM calibration error statistic in simulated cloud fraction (CF). This calibration error puts an uncertainty in the total simulated tropospheric thermal energy flux. It marks a lower limit of GCM resolution.

      It’s not about error. It’s about predictive uncertainty.

      You wrote, “However, that [incorrect climate state] translates into a propagating error in the response to a forcing, particularly feedback, not into an integrating series error in the forcing itself.

      The cloud feedback modifies the forcing. However, the cloud feedback is uncertain because of the simulation error in cloud fraction (CF). This simulation error puts a continuous average annual (+/-)4W/m^2 uncertainty in the net tropospheric thermal energy flux.

      That is, the cloud feedback response is not known to sufficient accuracy to reveal the effect of GHG forcing.

      In your 3a-3c, each ΔF(i) of the ΔF(n) is never known to better than (+/-)4 W/m^2 because the simulation error in CF leads to a large uncertainty in cloud response in every single step of a futures simulation. That in turn leads to a growing uncertainty in LWCF. That uncertainty is not the same as a growing error. We do not know the behavior of error in a climate futures projection.

      After ’n’ simulation steps of a futures projection, the uncertainty in ΔFn is the rss of the uncertainty in each step.

      You wrote, “Starting with the LTI model, it is therefore simple to show why it appears that T varies linearly with F(t) if F(t) is a near linear function of time

      Eqn. 1 reproduces GCM projected ΔT when ΔF(t) is non-linear in time.

      You wrote, “How does a (+/-) 4 W/m2 error in cloud flux propagate during the spin-up? Answer is that, … The systemic error or bias in cloud flux will remain in the system, but introduces no uncertainty into the long-term net flux, which will always go to zero.

      You're treating the uncertainty as an error again. The uncertainty does not go to zero. The error in the observable goes to zero, but it does not do so because the simulation is physically correct.

      The uncertainty arises because the model deploys a deficient physical theory. That means the physics is not described correctly. The climate state is not described correctly. The cloud response is not described correctly.

      The calculated flux has a large implicit uncertainty because of the cryptic error in the description of the state. You don't know where, exactly, it is; you don't know the flux magnitude to better than (+/-)4 W/m^2.

      If the climate state is incorrectly represented, how, then, is it possible to know that the change in CF is correctly simulated, that the forcing change is correctly represented, and that the calculated temperature change is a proper representation of the forcing change?

      The physical theory is not good enough to answer any of those questions.

      The physical meaning of the calculation is obscure. One can have no confidence in the accuracy of any of the simulated changes in the climate.

      The difference in two inaccuracies, each of unknown magnitude, is not an accuracy. It’s not even an error. It’s an uncertainty given by some independently determined calibration statistic.

      You wrote, “The temperature change is bounded.

      But its uncertainty is large and hidden.

      You wrote, “In practice, because all forcing runs will subtract the starting temperature to calculate the incremental temperature change, the effect of this temperature change is limited.

      And you are subtracting a simulated T_1 (+/-)u_1 from a T_2 (+/-)u_2, yielding a ΔT_1,2 with uncertainty (+/-)sqrt(u_1^2 + u_2^2) = (+/-)u_1,2 > u_1, u_2.

      See the above about subtracting occult inaccuracies.

      It’s not a question of bounded physical error, kribaez. It’s a question of how much you know about the accuracy of the result. In this case, that knowledge is (+/-)u_1,2 > u1, u2.

      You wrote, “What happens if a stochastic uncertainty is added annually to the bias error in cloud flux?

      A tendentious question, in that your condition of a _stochastic_ uncertainty determines your answer.

      You wrote, “I set the cloud flux to a bias error of 4 W/m2 …

      But the uncertainty is not a bias error. It is a (+/-) interval; one that does not subtract away or imply a mean of 0.

      You wrote, “If, on the other hand, I add each randomly drawn annual sample error to the previous value then the problem explodes very rapidly, as one would expect with an integrating time series. There is however no justification for this.

      A GCM calibration error statistic is not a sample error. It is a characteristic of the GCMs. It is a limit on their resolution.

      It says a simulation provides no information about the impact of thermal energy flux change that is less than a lower limit of (+/-)4 W/m^2.

      No information, kribaez.

      One does not know how clouds respond to the forcing because GCMs cannot resolve the change in cloud fraction. Therefore one does not know the change in air temperature.

      One can use a physical model to calculate all sorts of detailed things. But if the details are below the model resolution, they have no physical meaning; no significance.

      In a homely example, one can use a calculator to multiply two numbers, each with one sig-fig below the decimal, and report it to nine places. The calculator will allow that. But eight of those nine will have no meaning. The resolution of the calculation is one significant figure.

      Likewise your models. Their lower limit of resolution of the effects of thermal energy flux is (+/-)4 W/m^2. The effect of any forcing or feedback that is smaller than that, is invisible to the model.

      You wrote, in conclusion, “this combination of systemic bias error and random annual error would be large visible contributors to the calibration error which you identify, but as indicated above, they have very little effect on the uncertainty of temperature projection post-spin-up, apart from feedback error.

      Rather, they have a large impact on the uncertainty of the temperature projection.

      Even in the spin-up base climate state, you do not know that the initial spin-up air temperature is a correct representation of the equilibrium climate energy-state. The cloud fraction is wrong. The tropospheric thermal energy flux is wrong. Even if the air temperature is correct due to tuning, it is correct for the wrong reasons.

      The underlying physics is not correct. The simulated equilibrium climate state is wrong.

      There is an initial uncertainty interval that cannot subtract away. Instead, the incorrect initial state is projected incorrectly further, by way of a deficient physical theory.

      It does not matter that GCM parameters have been chosen to reproduce the air temperatures of the calibration period. The spin-up climate state is wrong, the physics is wrong, and all the simulated observables have a large but hidden uncertainty.

      [LWCF is not clear: LW (Long Wave) Cloud Fraction? .mod]

      • Pat,
        Firstly, I would like to apologise for calling you Dr Frank. It was not intentional at all.

        Secondly, I would like to thank you for the long responses above. I do not underestimate the massive effort you have put into writing and defending your paper.

        Thirdly, I will re-state that I agree with your qualitative conclusion that the GCMs are unreliable, and useless for informing decision-making. My concern here is with your methodology.

        On many occasions, I have (professionally) built models of physical systems ranging from simple analytic functions to complex dynamical simulators. Uncertainty analysis for such systems is founded on the fundamental principle that if one can define the joint distribution of inputs, then equi-probable sampling of the input space will yield via the model the joint distribution of the outputs. This output space – defined via the model – is always strictly a conditional joint probability, since it does not and cannot include “model error” which arises from incorrect or incomplete specification of the model.

        The above principle forms the foundation for uncertainty analysis.

        The inverse problem, where we have a number of uncertain observations in the output space and we wish to narrow down the range of uncertainty on the input parameters which form the joint distribution of inputs, is still governed by the same principle. Typically this is done by frequentist inverse transform, a Bayesian method or brute-force filtering.

        It is important to note that a “resolution problem” in the prediction of a key output or a “calibration problem in an input variable” is not hidden from view anywhere in the above foundational principle. On the contrary, sampling from the input distribution should reveal the output uncertainty – sufficient sometimes to justify additional data collection or scrapping the model. You seem to be denying this.

        Even with the simplest model, there should be no esoteric existence of an uncertainty which is invisible to sampling via the model. Consider a linear model of the form Y = bX. Sampling from an input distribution of b will yield the correct uncertainty for Y at some value of X. Equally, if X happens to be a time variable, you can calculate the uncertainty in Y as sqrt(X^2 · var(b)). Alternatively, I can propagate an error of sqrt(var(b)) summed over unit timesteps. I will obtain the same answer for all three approaches. There is no hidden, esoteric uncertainty anywhere in this. If, instead, I have a measurement uncertainty in X of (+/-)2, I can sample from a U(-2,+2) distribution and obtain the uncertainty of Y as being in the range [-2b, +2b]. If I have uncertainty in b AND a measurement uncertainty in X, I can still sample the distribution of b and the error in X to correctly obtain the uncertainty in Y. Once again, there is no hidden esoteric uncertainty in any of this which is not revealed by appropriate sampling or indeed by correctly applied quadrature. There is however a fundamental dependence on the mathematical or the physical model which is under evaluation.

        With large models, more often than not, many methods of dealing with this problem are impractical, and it is not unusual at all to find that engineers will test uncertainty by sampling methods applied to reduced models or emulators, supported by sensitivity tests applied to the main model. For such results to have any meaning, it is of critical importance that the reduced model is adequate to represent the key functional relationships honoured in the main model.

        I have tried to point out above that your emulation model is inadequate to the task you are setting it here, because of its inability to distinguish between a component of flux, the net flux (balance) and a forcing. In the main model here (a GCM), these represent three different variables, with different magnitudes, and distinct properties. An uncertainty in a component of flux does not translate into an error in net flux at the end of the spin-up period. The mathematics of the problem will force the net flux into balance (actually with small fluctuations around zero), and the uncertainty in net flux is close to zero. THIS DOES NOT LEAVE AN INVISIBLE UNCERTAINTY IN NET FLUX, and nor does it represent any confusion between error and uncertainty. It is something which is forced by the mathematics of the problem.

        The forcings which are subsequently applied are then by definition imposed exogenous changes to the near-zero net flux.

        You correctly state that the spin-up leaves a condition where “the spin-up climate state is wrong” or “the simulated equilibrium state is wrong”. Yes, it does, without a doubt. Can we estimate what this does to the uncertainty in temperature projection arising uniquely from the uncertainty in cloud fraction? Only with great difficulty. We can obtain some approximation to it, but it requires a different type of analysis from the one you have carried out when comparing observed to modeled cloud fraction.

        The uncertainty in temperature projection arising from the uncertainty in cloud characterisation is almost entirely associated with the uncertainty in the FEEDBACK to net flux, and you can quite legitimately argue that there does exist in this feedback system what you would describe as “a linear propagation of uncertainty”, but (a) it is NOT a linear propagation in time, it is closer to a linear propagation of flux uncertainty with temperature – which makes an enormous difference, and (b) the resulting integrated series (in flux) does not translate the entire calibration error in cloud flux into the feedback error in net flux, since systemic bias in and of itself has little effect on the net flux. What is left, in essence, is a gradient error i.e. the rate of change of feedback flux with respect to temperature.

        You will find that this flux feedback error from clouds on its own, when translated reasonably into temperature uncertainty, is more than sufficient to challenge the ability of climate models to project temperatures in any meaningful way. If you had done this, I would applaud your paper. As it is, I believe it is very poorly founded, and I take no pleasure in stating this.

        • Note 8: mathematics is needed to explore meanings, understanding, and assumptions

          I had quit this discussion for two or three days because I was very impressed with what kribaez was writing and was happy to watch. Clearly though, Pat is not. One of the problems is that after using a bit of mathematics, kribaez has moved to a substantial amount of words, which are very meaningful to him, but only somewhat meaningful to me. And then Pat can start arguing with or misunderstanding those words, and then resort to saying that only he understands the difference between error and uncertainty.

          I feel it would be valuable to use some simple mathematical models to investigate the crux of Pat’s method and whether it has any predictive value that could feasibly be tested. There are currently 39 uses of the term “random walk” upthread, and the argument was made that Pat’s bounds are not produced by a random walk. I am happy to accept that, but I am more interested in the question “do Pat’s bounds constrain random walks in GCMs and if so can the bounds be tested by running the GCMs many times?” Someone is bound to say “no, there you are, confusing error with uncertainty again”.

          Here at least are some questions for Pat which, if he graciously answers them, could get us started.

          First, are the widths of the bounds at 2100 (say) pretty independent of the new forcing represented by the Delta-F_i’s? So, if all those Delta’s were zero, would we still get wide bounds? I believe the answer is yes, in which case we can explore the concepts without new CO2 forcing.

          Second, if the emulation model was fitted to one particular GCM, rather than the given ensemble, could uncertainty bounds still be derived? I believe the answer is yes, and if so then it means we can explore the concepts without considering inter-model variability.

          If those two answers are yes, then consider the following.

          M(t) = a M(t-1) + B(t;m,s) (Equation *)

          Would Pat’s emulator be of this form with a = 1 and B(t;m,s) a random variable of mean m (probably zero) and variance s^2? If so, would s^2 depend on something like the +/-4 W/m^2 TCF error during the calibration period?

          If so, after T years, is the 1-sigma uncertainty estimated as sqrt(Ts^2)?

          With simplified mathematics like this we might be able to come to a common understanding on what the paper says, and on what assumptions it has to make to get there. I hope that would help others as well as me.
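
          For what it is worth, a small sketch of Equation (*) with a = 1 and B(t;m,s) taken as N(0, s), with s set (purely for illustration) to 4 from the (+/-)4 W/m^2 figure, shows the sqrt(T·s^2) growth asked about:

```python
import numpy as np

rng = np.random.default_rng(2)
s, T, n_runs = 4.0, 80, 20_000       # s set to 4 from the (+/-)4 W/m^2 figure, for illustration

# Equation (*) with a = 1 and m = 0: M(t) = M(t-1) + B(t; 0, s), started from M(0) = 0.
walks = rng.normal(0.0, s, size=(n_runs, T)).cumsum(axis=1)

for t in (1, 4, 16, 80):
    print(f"t = {t:2d}: ensemble sd = {walks[:, t - 1].std():6.2f}   s*sqrt(t) = {s * np.sqrt(t):6.2f}")
```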

          • ” One of the problems is that after using a bit of mathematics, kribaez has moved to a substantial amount of words, which are very meaningful to him, but only somewhat meaningful to me.”

            Yes, communication is very difficult, especially between people!

            I can perhaps address some of your questions specifically.
            “do Pat’s bounds constrain random walks in GCMs and if so can the bounds be tested by running the GCMs many times?” Pat’s time series in Forcing is not strictly a random walk but it is close. Statisticians would describe his timeseries in forcing as “a (nonstationary) integrating series of order one”. Variance is accumulated annually like a random walk, so the latter part of your post is essentially correct i.e. s^2 is derived from the +/- 4W/m2 TCF error and the 1 – sigma uncertainty in FLUX is estimated as sqrt(T x s^2). The second part of your question is what a lot of my “only somewhat meaningful” post was seeking to address. I was trying to emphasise that the type of resolution problem treated by Pat here is not invisible to Monte Carlo sampling. So, yes, IN THEORY, it is possible to test Pat’s uncertainty propagation for a single GCM by running the GCM many times, each time sampling a different input value of total cloud fraction (or LWCF) at the start of the spin-up period, and then capturing the distribution of the projected temperatures at some point in time. The envelope could then be compared with Pat’s projected uncertainty envelope and should be comparable if he is correct. In practice, however, this is impossible to do because each run would take many months. Ultimately, therefore a sampling approach can only be carried out on an emulator of some sort.

            “First, are the widths of the bounds at 2100 (say) pretty independent of the new forcing represented by the Delta-F_i’s?” In Pat’s model, yes, the flux bounds are the same, though the temperature bounds will vary with the gradient parameter selected in Pat’s emulator.
            “So, if all those Delta’s were zero, would we still get wide bounds?” In Pat’s model, yes. That is why I looked at sampling a U(-4,+4) distribution of cloud flux error on my primitive LTI model to examine the impact on uncertainty in net flux and temperature at the end of the spin-up period.

            I hope this helps.

          • Kribaez,

            “So, yes, IN THEORY, it is possible to test Pat’s uncertainty propagation for a single GCM by running the GCM many times, each time sampling a different input value of total cloud fraction (or LWCF) at the start of the spin-up period, and then capturing the distribution of the projected temperatures at some point in time.”

            I don't think I can agree with this. Uncertainty is not a random variable in and of itself and so it doesn't cancel over many runs or over time. Merely inputting different values at the start of a run doesn't capture the fact that uncertainty grows over time. If the assumption is that GCMs react in the same manner each time a run is made, then a Monte Carlo analysis doesn't tell you much about the uncertainty growth over time. It will give you an indication of the relationship between an error in the input and the resultant output of the GCM, but that is not a measure of uncertainty. Uncertainty has to do with not knowing the output for sure even with a single input. If the GCM output is deterministic, i.e. reacts the same way each time for the same input, then where does the uncertainty of the output come into play? Uncertainty has to do with the output not being deterministic even with the same input each time.

          • kribaez, thank you, that is helpful, especially if Pat concurs. Although it might be a lot of modelling work, perhaps runs covering as few as 4 years would suffice. Those would give 1-sigma uncertainty bounds in the forcing domain of 4*sqrt(4) = +/- 8 W/m^2, which represents quite a large temperature uncertainty, and I hope that would be detectable.

            Here is a follow up question. Have model runs already been done which might answer this question? In particular, what is the exact statistical description for the error bounds recorded in Panel A of Figure 6? I previously summarized Figure 6 as follows:

            “ September 26, 2019 at 2:03 am
            Pat, thanks, so the 1-sigma uncertainty bars in Panel B do relate one-to-one to the bars in Panel A, but they are calculated differently. Now I think the bars in Panel A can be used to predict the spread which would occur if new model runs were made. But your point in Panel B is that those runs could have strayed ever so far from reality because of the physical uncertainty in the parameters of those models. Is that a fair summary?”

            This was never answered. It is important because it relates to whether Pat’s bounds do relate to the possible universe of model runs, which we have assumed above.

          • Reply to Tim Oct1 5:50am

            Tim, your spiel is an example of why I think it is important to argue with mathematics. I understand, though, that some people find it hard to do that. I shall try here to put your words into mathematics and examine the consequences. In my Equation (*) from earlier I am going to take a=1 as read (for now), and use ‘z’ in place of ‘m’ because there are too many m’s; ‘z’ stands for zero, which is the value it might be expected to be. So we have:

            M(t) = M(t-1) + B(t;z,s) (*)

            Here are 5 points you made, with my riposte.

            1. “Uncertainty is not a random variable in and of itself and so it doesn’t cancel over many runs or over time.”

            As pointed out before, Pat treats uncertainties as random variables when he takes root mean square. So if u_1 and u_2 are "uncertainties", real numbers, in combination he uses U(u1,u2) = sqrt(u1^2+u2^2). This only makes sense if u1 is considered to be a standard deviation of a random variable V1, and u2 likewise for an independent r.v. V2. And this U expression does encapsulate a modest amount of cancellation, wherein V1 might be +0.303u1 and V2 might be -1.007u2. This is the case of 2 independent rulers being used, once each.

            2. “Merely inputting different values at the start of a run doesn’t capture the fact that uncertainty grows over time.”

            Why not? Let there be n input values m_1(0),…,m_n(0). Then M_i(1) = m_i(0) + B_i(1;z,s) has mean m_i(0)+z and variance s^2. If we observe the values of M_i(1), call them m_i(1), then a “good” estimate of z is

            z* = sum_{i=1}^n (m_i(1)-m_i(0))/n

            and a “good” estimate of s^2 is

            (s*)^2 = sum_{i=1}^n (m_i(1) − m_i(0) − z*)^2/(n−1)

            So, we had no clue how the model was going to evolve over one time step, but after our n trials we do have a clue. Alternatively, we did have a clue, by some theory like Pat's, that s would be a known function, say s', of +/- 4 W/m^2. Well, now we can start asking whether s* corroborates that inferred value s'. This is exactly what I am after. As for growing over time, after t steps the 1-sigma uncertainty grows to (s*)sqrt(t), under independence assumptions. (A small numerical sketch of this estimate appears at the end of this comment.)

            3. “It will give you an indication of the relationship between an error in the input and the resultant output of the GCM but that is not a measure of uncertainty.”

            But m_1(0) was a given, fixed, input, and it can have no error other than in relation to reality which I’ll call R(0), when the error is m_1(0)-R(0). But does Pat’s theory of uncertainty relate to the difference between models and reality at time t, or to the plausible spread of model outputs at time t? I thought it was the latter, in which case input error is meaningless.

            4. “Uncertainty has to do with not knowing the output for sure even with a single input.”

            True, even though we know m_i(0), m_i(1) is an observation of the r.v. M_i(1) = m_i(0) + B_i(1;z,s).

            5. “If the GCM output is deterministic, i.e. reacts the same way each time for the same input, then where does the uncertainty of the output come into play?”

            That’s a reasonable question; I wonder if kribaez can answer: do GCM’s use pseudo-random numbers to randomize their runs?
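
            Here is the small numerical sketch promised in point 2, assuming (for illustration only) that the one-step increments really are draws from N(z, s) with z = 0 and s = 4:

```python
import numpy as np

rng = np.random.default_rng(3)
z_true, s_true, n = 0.0, 4.0, 50            # illustrative values only

m0 = rng.uniform(280.0, 290.0, size=n)      # n arbitrary starting values m_i(0)
m1 = m0 + rng.normal(z_true, s_true, n)     # one-step outputs m_i(1) = m_i(0) + B_i(1; z, s)

d = m1 - m0                                 # observed one-step increments
z_star = d.mean()                                        # estimate of the drift z
s_star = np.sqrt(((d - z_star) ** 2).sum() / (n - 1))    # estimate of s

print(f"z* = {z_star:+.2f} (true {z_true}),  s* = {s_star:.2f} (true {s_true})")
print(f"1-sigma uncertainty after t = 80 steps: {s_star * np.sqrt(80):.1f}")
```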

          • Rich,

            “As pointed out before, Pat treats uncertainties as random variables when he takes root mean square. So if u_1 and u_2 are “uncertainties”, real numbers, in combination he uses U(u1,u2) = sqrt(u1^2+u2^2). ”

            When you square the uncertainty values they will always add, never subtract. Then when you take the square root you always get an interval, plus and minus, that is larger than either input alone. Thus the uncertainty grows. It never cancels like a random variable would.

            “This only makes sense if u1 is considered to be a standard deviation to a random variable V1”

            Why? The uncertainty is not a random variable. It is an uncertainty in the output. Doing the square simply eliminates the +/-. You then take the square root to get back to a +/- interval. Suppose only u1 has a value that is non-zero. The process simply gives you back what u1 is. It doesn't convert it to a standard deviation of a random variable. Suppose u1 and u2 are both the same. You wind up with 1.414 * u1 as the new uncertainty interval. If you have 3 successive uncertainties you get 1.73 * u1. The uncertainty grows with each interval. Only if you do something that lessens the uncertainty mid-run will you see the uncertainty go down, e.g. u4 = .9 * u1. I don't believe any of the GCMs can do anything mid-run to lessen the uncertainty.

            “Why not? Let there be n input values m_1(0),…,m_n(0). Then M_i(1) = m_i(0) + B_i(1;z,s) has mean m_i(0)+z and variance s^2. If we observe the values of M_i(1), call them m_i(1), then a “good” estimate of z is”

            You are, once again, assuming uncertainty is a random variable with a mean and a variance. It isn’t. The uncertainty starts at the beginning of the run and unless something happens mid-run to change that value of uncertainty, it never changes, e.g. the +/- 4Wm^2. All that changes is what the total uncertainty becomes, e.g. the root mean square. It always grows, it never decreases.

            “So, we had no clue how the model was going to evolve over one time step, but after our n trials we do have a clue.”

            No, you don't. Again, run after run, the GCM is going to give a deterministic output with an uncertainty that grows with each iteration. If with an input of v_1 you get an output of o_1 with an uncertainty of u_1 (an accumulation of error over however many iterations you make), you haven't changed the amount of uncertainty in the output at all. Change the input to v_2 with an output of o_2 (a second deterministic output of the model) and you *still* have an output uncertainty of u_1.

            You can do however many Monte Carlo runs you want to make and you will never lessen the uncertainty associated with the output.

            “But m_1(0) was a given, fixed, input, and it can have no error other than in relation to reality which I’ll call R(0), when the error is m_1(0)-R(0). But does Pat’s theory of uncertainty relate to the difference between models and reality at time t, or to the plausible spread of model outputs at time t? I thought it was the latter, in which case input error is meaningless.”

            You are making this more complicated than it needs to be. If you have inputs v_1 and v_2 with outputs o_1 and o_2, they are both still subject to the uncertainty associated with the model output. You may get a feel for what an error equal to v_1 - v_2 causes in the output, but both are *still* uncertain outputs. In fact, unless v_1 - v_2 is greater than the uncertainty, how do you even know what the sign of the difference is?

            “True, even though we know m_i(0), m_i(1) is an observation of the r.v. M_i(1) = m_i(0) + B_i(1;z,s).”

            How do you know this? Unless B_i(1;z,s) is greater than the uncertainty, how do you even know for sure what M_i(1) is?

          • Tim,
            What I am saying is genuinely fundamental. Vasquez and Whiting, whom Pat is citing, make use of Monte Carlo methodology in their paper(s) to estimate uncertainty arising from (both) systematic bias in calibration and random error. Here is a quote from Vasquez, Whiting and Meerschaert 2010:-
            "To analyze random and systematic error effects using Monte Carlo simulation, the approach proposed by Vasquez and Whiting (1999) is used, which consists of, first, defining appropriate probability distributions for the random and systematic errors based on evidence found from different data sources. Then bias limits are defined for the systematic errors of the input variables of the model. Samples are drawn using an appropriate probability distribution for the systematic errors if a priori information is available. Otherwise, a uniform distribution is used. For the random errors, samples are taken from each of the probability distributions characterizing the random error component of the input variables and then the samples are passed through the computer model."

            There is no output uncertainty that is invisible to this approach.
            And uncertainty in an output variable does not always increase with time. This depends on the problem being solved.

            "There is no output uncertainty that is invisible to this approach."

            Of course there is.

            “And uncertainty in an output variable does not always increase with time. This depends on the problem being solved.”

            Give me an example.

            “Samples are drawn using an appropriate probability distribution for the systematic errors”

            The uncertainty Pat used is a constant obtained this way: “We know from Lauer and Hamilton, 2013 that the annual average ±12.1% error in CMIP5 simulated cloud fraction (CF) produces an annual average ±4 W/m^2 error in long wave cloud forcing (LWCF).”

            So what the uncertainty becomes over "n" iterations is sqrt{ [n(n+1)/2] * 16 }. Do this over 80 years and you get a total uncertainty of about (+/-)220 W/m^2 (I'm doing this in my head so forgive my inaccuracy). That is basically Pat's Equation 6 in his paper. That's enough to swamp the ability to tell what is going to happen in 80 years with the resolution required to distinguish a few degrees of warming.
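
            Taking the formula quoted above at face value (no claim is made here that it matches the paper's eqn. 6 exactly), a two-line check of the arithmetic:

```python
import math

def total_uncertainty(n, u=4.0):
    # The formula as quoted above: sqrt( [n(n+1)/2] * u^2 )
    return math.sqrt(n * (n + 1) / 2 * u ** 2)

print(f"n = 80 years: +/- {total_uncertainty(80):.0f} W/m^2")   # ~ +/-228 W/m^2
```

            which comes out at roughly (+/-)228 W/m^2, consistent with the "about (+/-)220" figure.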

          • [U]ncertainty in an output variable does not always increase with time. This depends on the problem being solved.

            Indeed! By mistaking the 20-yr rms error estimate of 4 W/m^2 in modeling the difference between all-sky and clear sky LW emissions (so-called LCF) reported by Lauer and Hamilton as a characterization of a YEARLY increment in a CUMULATIVE sum of variances, Frank unwittingly adopts a totally wrong conception of error propagation in modeling time-series.

            In the recursion relationship that represents a developing Markov chain

            y(n) = a y(n-1) + b x(n)

            a cumulative sum of stochastically independent, gaussian inputs x is obtained with a = b = 1. But then the output y is inherently unstable, wandering off into an ever-widening 1-D random walk. There simply is no evidence that calibrated GCMs behave that way, as Spencer demonstrated convincingly for the pre-industrial control runs in Figure 1 of his column of Sept.12.

            Instead of the well-known Wiener process, GCM output and its error are far more closely represented by the Ornstein-Uhlenbeck process, with a < 1. This introduces a dynamically characteristic exponential decay to the effect of input at any time and produces autocorrelated "red" noise when the input is gaussian white noise. Such a system always satisfies the bounded input/bounded output (BIBO) stability criterion. A semblance of such a spectral characteristic is often found in actual geophysical time series. See: https://journals.ametsoc.org/doi/pdf/10.1175/1520-
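
            A small sketch of the contrast described above, with illustrative values a = 1 (Wiener-like) versus a = 0.8 (Ornstein-Uhlenbeck-like), b = 1, and gaussian white-noise input:

```python
import numpy as np

rng = np.random.default_rng(4)
n_steps, n_runs, sigma = 200, 10_000, 1.0
x = rng.normal(0.0, sigma, size=(n_runs, n_steps))    # gaussian white-noise input

def recurse(a, b=1.0):
    """y(n) = a*y(n-1) + b*x(n), started from y(0) = 0; returns the ensemble sd at each step."""
    y, sds = np.zeros(n_runs), []
    for n in range(n_steps):
        y = a * y + b * x[:, n]
        sds.append(y.std())
    return sds

walk = recurse(a=1.0)   # Wiener-like: sd grows roughly as sigma*sqrt(n), without bound
ou = recurse(a=0.8)     # OU-like: sd saturates near sigma/sqrt(1 - a^2)

for n in (10, 50, 200):
    print(f"n = {n:3d}: a = 1.0 sd = {walk[n - 1]:5.2f}   a = 0.8 sd = {ou[n - 1]:5.2f}")
print(f"theoretical asymptote for a = 0.8: {sigma / np.sqrt(1 - 0.8 ** 2):.2f}")
```

            The a = 1 case wanders with an sd growing as sqrt(n); the a < 1 case saturates near sigma/sqrt(1 − a^2), i.e. it is BIBO-stable.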

          • sky:
            From the document in your link:

            “Consider a discrete time series which depends only on its own immediate past value plus a random component.”

            The annual mean of the overall discrepancy of +/- 4Wm^2 is not a “random component”. You can argue about what that mean value should be on an annual basis but it will be difficult to argue that there is no mean (average) value.

            Once again, uncertainty is an interval, not a random variable. It only indicates the interval in which you might expect to find the variable but it doesn’t tell you where in that interval it is. Therefore it simply can’t cause a random walk. That uncertainty interval doesn’t take on a random value on each iteration. It isn’t plus one time and negative the next. It’s like describing the eccentric orbit of a satellite. You can certainly calculate the average distance of that satellite and say it is x miles +/- the eccentricity value. That eccentricity value doesn’t change randomly from year to year (theoretically). And it *is* an uncertainty value applied to the average distance of the satellite. If there were a small impetus being applied continuously over time to that satellite to cause it to expand its orbit (i.e. a “forcing”) then the uncertainty interval on the eccentricity would also grow over time. It wouldn’t cause a “random walk” in any way, shape, or form.

          • The annual mean of the overall discrepancy of +/- 4Wm^2 is not a “random component”.

            A clear reading of everything I wrote here shows that 4Wm^2 is consistently treated as the rms value of the model-average LCF error, i.e. a 20-yr sample estimate of the variability of that random error that specifies a FIXED uncertainty. Pavlovian persistence in claiming otherwise is just throwing more peanut shells on a blast of hot air.

          • sky:

            “A clear reading of everything I wrote here shows that 4Wm^2 is consistently treated as the rms value of the model-average LCF error, i.e. a 20-yr sample estimate of the variability of that random error that specifies a FIXED uncertainty. Pavlovian persistence in claiming otherwise is just throwing more peanut shells on a blast of hot air.”

            Maybe we are talking past each other. Pat is using +/- 4Wm^2 as an interval, not just a positive value of 4Wm^2. That is what taking a square root of an interval causes to happen. You still wind up with an interval.

            That fixed uncertainty is an annual value the way Pat has done it. As an annual value it has to be combined over however many iterations are made.

          • “4Wm^2 is consistently treated as the rms value of the model-average LCF error”
            Plus it is rms of the error at individual grid points. Yet Eq 1 treats it as the rms error of a global average.

          • From Nick Stokes: “Plus it is rms of the error at individual grid points.

            From Lauer and Hamilton, page 3833: “The overall comparisons of the annual mean cloud properties with observations are summarized for individual models and for the ensemble means by the Taylor diagrams for CA, LWP, SCF, and LCF shown in Fig. 3. These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means. …

            In both CMIP3 and CMIP5, the large intermodel spread and biases in CA and LWP contrast strikingly with a much smaller spread and better agreement of global average SCF and LCF with observations …

            FIG. 7. Biases in simulated 20-yr-mean LWP from (left) the (top to bottom) four individual coupled CMIP5 models and (middle) their AMIP counterparts, with the smallest global average rmse in LWP. (my bold)”

            Wrong again, Nick

          • See (and Tim),

            “I wonder if kribaez can answer: do GCM’s use pseudo-random numbers to randomize their runs?” No. The model runs are deterministic, which means that if you repeat a run with identical initial conditions and inputs, then you will obtain the same result. Typically, however, the results that you will normally see represent an average of multiple runs (at least 5) for a given forcing scenario on the same model. Each of these runs is “kicked-off” from a different time-point from the spin-up saved results, and therefore each has slightly different initial conditions; each run then yields a slightly different outcome. The fluctuations visible in GCM runs come from the chaotic character of the governing mathematics and not from any input stochastic variation.

            None of the above affects what I am saying about the validity of using Monte Carlo (MC) methodology to test for Pat’s uncertainty. If you want a truly simple, but very relevant example, I would suggest that you return to my post above (https://wattsupwiththat.com/2019/09/19/emulation-4-w-m-long-wave-cloud-forcing-error-and-meaning/#comment-2809839) where I discuss the linear relationship Y = bX. It is relevant because Pat has convinced himself that the temperature change from initial can be adequately emulated as a simple linear function of cumulative forcing F(t).

            Let me fill in some blanks on this example. We are interested in the uncertainty in Y arising from a calibration problem in b, when X reaches 100. Our best measurements tell us (no more than) that b sits in an interval between 1 and 3, say. We decide to treat b as a random variable with a Uniform distribution given by U(1,3). We can readily calculate that the maximum possible value of Y is then 300 and the minimum value is 100. The range (spread) of possible values is therefore 200, and it happens to be uniformly distributed in this case, so we can say that when X is 100, Y is a RV with a pdf of U(100, 300). We do not know the error in Y, since we do not know the true value of b, but we do know its uncertainty, a point which Pat has made repeatedly.

            Alternatively, we can calculate the variance of b and use standard quadrature to calculate the variance in Y at any value of X. Since b is uniformly distributed, its variance is given by (range)^2/12 = 2^2/12, which equals 1/3 in this instance. Its sd is then equal to sqrt(1/3). When X is 100, Var(Y) = 100^2* Var(b) = 100^2*1/3. Since Y is uniform, we can convert this back to a range: range = sqrt (12 * 100^2 * 1/3) = 200. We have the same answer as before.

            The third alternative is to run a Monte Carlo which samples b as a RV with distribution U(1,3), and for each realisation, calculates Y = 100b. The resulting values of Y will all sit between 100 and 300 and will conform to a U(100, 300) distribution.
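
            A minimal sketch of the three equivalent calculations just described, using the same numbers (b ~ U(1,3), X = 100):

```python
import numpy as np

rng = np.random.default_rng(5)
X, n = 100.0, 1_000_000

# Approach 1: direct range. With b ~ U(1, 3), Y = bX spans [100, 300] at X = 100.
range_direct = 3.0 * X - 1.0 * X

# Approach 2: quadrature. Var(b) = (3 - 1)^2/12 = 1/3 and Var(Y) = X^2 * Var(b);
# converting back to a uniform range: range = sqrt(12 * Var(Y)).
var_b = (3.0 - 1.0) ** 2 / 12.0
range_quad = np.sqrt(12.0 * X ** 2 * var_b)

# Approach 3: Monte Carlo sampling of b.
Y = rng.uniform(1.0, 3.0, size=n) * X

print(f"direct range        : {range_direct:.1f}")                  # 200
print(f"quadrature range    : {range_quad:.1f}")                    # 200
print(f"Monte Carlo min, max: {Y.min():.1f}, {Y.max():.1f}")         # ~100, ~300
print(f"Monte Carlo sd      : {Y.std():.2f} (analytic {X * np.sqrt(var_b):.2f})")
```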

            Tim, note
            (a) that the uncertainty calculation here in this first case yields an uncertainty which will increase with increasing values of X
            (b) what we have calculated here is a “resolution uncertainty” in Y
            and (c) the Monte Carlo approach is perfectly capable of revealing that uncertainty.

            Two other important things to note are firstly that the above example deals with a calibration uncertainty in the parameter value, b, and not the variable value, X, and, secondly, that the error or standard deviation of Y varies with X, and not with sqrt(X) for this particular linear problem.

            Pat is concerned with resolution uncertainty in the variable, X, rather than the parameter, b, so now let’s consider the case where b is an accurately known value, and X carries uncertainty. If X can only be measured to an accuracy of (+/-)2, then the resulting uncertainty in Y will be (+/-)2b for all values of X. Again this can be confirmed by MC, although it is clearly not necessary in this case. There is no growth of uncertainty for this type of error, and no hidden resolution uncertainty in either of the two cases.

            Now let’s turn to Pat’s specific claim. He argues that the total temperature change, T(t), from the start of a forcing run varies linearly with cumulative forcing, F(t).

            T(t) = bF(t) where b is a constant, assumed to be known in Pat’s model.
            If we index the time into annual times, t1, t2, t3…ti…, we can write :-
            T(ti) = bF(ti)
            T(ti+1) = bF(ti+1)

            Hence T(ti+1) – T(ti) = ΔTi+1 = b(F(ti+1)-F(ti)) = bΔF(ti+1)
            T(ti+1) = b(F(ti) + ΔF(ti+1)) EQ 1

            In my Y = bX example above, the calibration uncertainty in my variable, X, was ever-present, but it was applied to the total value of X, so it did not grow as X increased.
            Pat, however, argues that his LWCF calibration error of (+/-)4 watts/m2 should be added to each annual INCREMENT of forcing, so that
            ΔF(ti+1) becomes [ΔF(ti+1)+ error], where the error is drawn from a U(-4,4) distribution. The result is variance growth as per a random walk.
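
            A short sketch (mine, not kribaez’s, assuming NumPy) of the contrast being described here: a U(-4, 4) error applied once to the total does not grow, while an independent U(-4, 4) error added to every annual increment accumulates like a random walk:

```python
# Contrast: one calibration error on the total versus a fresh error on
# every annual increment, accumulated over 100 years.
import numpy as np

rng = np.random.default_rng(1)
n_years, n_trials = 100, 20_000

# Case 1: one error on the total -- spread does not grow with time
err_total = rng.uniform(-4, 4, size=n_trials)
print(err_total.std())                      # ~2.31 = 8/sqrt(12), constant

# Case 2: an independent error on every annual increment -- random-walk growth
err_steps = rng.uniform(-4, 4, size=(n_trials, n_years))
walk = err_steps.cumsum(axis=1)
print(walk[:, 0].std(), walk[:, -1].std())  # ~2.31 and ~2.31*sqrt(100) = ~23.1
```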

            The problem is that there is no physical or theoretical justification for this type of uncertainty propagation in the GCM – or in a sophisticated emulator of the GCM. If there were, then yes, it should in theory be revealed by MC testing on single GCMs.

            There IS a physical and a theoretical justification for a different type of uncertainty propagation from cloud characterisation, and that is via the feedback flux. As I have already stated in a previous post, that yields a considerable resolution uncertainty in temperature projection, but it is propagated through temperature change rather than time and does not have the same theoretical form as Pat’s uncertainty growth.

            Pat’s emulator, however, is far too primitive to assess the effect of uncertainty in the feedback flux, since it has no ability to discriminate between a flux component, the net flux and a forcing – all of which are quite different variables in their magnitude and effect. To the extent that a feedback error would be visible at all in Pat’s model, it would affect his gradient term (parameter b in the above) rather than total forcing. Pat (instead) is trying to add a calibration error in a flux component (LWCF) to a forcing series, which is an exogenous deterministic input, but which he is treating in his emulator as a proxy for a net flux change.

          • Response to Tim Gorman,

            ““And uncertainty in an output variable does not always increase with time. This depends on the problem being solved.”

            Give me an example.”

            Sure. Drop a tennis ball with a coefficient of restitution of 0.3 onto the ground from a height estimated to be 6 m (+/-)2 m. What is the uncertainty in the elevation of the tennis ball above the tennis court after 10 minutes?
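
            A quick sketch (my own numbers, using an idealized bounce model, not part of the original comment) of why the answer here is essentially zero: with a coefficient of restitution below 1 the ball comes to rest within a few seconds, so the (+/-)2 m uncertainty in the drop height contributes nothing to the elevation after 10 minutes:

```python
# Idealized bounce model: each rebound speed is e times the impact speed,
# so the total bouncing time is a finite geometric series.
import math

def time_to_rest(h0, e=0.3, g=9.81):
    """Total bouncing time (s) for an ideal ball dropped from height h0 (m)."""
    t_first = math.sqrt(2 * h0 / g)        # time to first impact
    # each later flight lasts 2 * t_first * e^k; summing the series gives:
    return t_first * (1 + 2 * e / (1 - e))

for h0 in (4.0, 6.0, 8.0):                 # drop height 6 m (+/-) 2 m
    print(h0, round(time_to_rest(h0), 2), "s")
# ~1.7 s, ~2.1 s, ~2.4 s -- after 10 minutes the elevation is 0 in every case.
```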

            Alternatively, have a look at the primitive LTI model above. You initialise it with an uncertainty of (+/-)4 in the net flux (balance). What is the uncertainty in net flux after 500 years?

          • 1sky1, “a 20-yr sample estimate of the variability of that random error that specifies a FIXED uncertainty.”

            The pair-wise correlation of TCF error shows that it is not random (Table 1, paper page 7).

            The calibration uncertainty is a characteristic of CMIP5 GCMs. It is a product of deficient theory. It shows up in every step. The uncertainty it puts into a projection must increase with every step.

          • Pat,
            “These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means”
            An odd thing for you to highlight. He is emphasising that he is calculating correlation of spatial variability, not variability of a spatial average over time. And it is from that correlation that the rmse of 4 Wm⁻² is calculated.

            “global average rmse in LWP. (my bold)”
            That does not mean the rmse of the global average. It means, as it says, the global average of rmse.

          • “The pair-wise correlation of TCF error shows that it is not random”

            Spatial-lag autocorrelation of cloud-cover (sic!) error merely indicates that it doesn’t vary spatially like white noise. This doesn’t preclude red-noise random variation or various sporadic deficiencies in modeling. In any event, spatial variability is not the issue in the time-propagation of modeling error. As Lauer and Hamilton make clear, their LCF-error determinations “give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means.”

            In other words, there’s a model time series of annual means of spatial variability that is statistically characterized by the FIXED cross-correlation and rms error over the prescribed 20-yr period of satellite observations. That 4 W/m^2 rms error of the aggregate model mean CANNOT legitimately be denominated PER ANNUM, let alone be compounded. Nor does it represent the average standard deviation of 20 individual years, as Frank has claimed in shocking violation of the algebraic law that square roots are NOT distributive. Moreover, the correlation of 0.93 with observations doesn’t point to any truly gross modeling deficiencies. While such do exist, they are not at all those ostensibly discovered here.

          • I wrote, “The pair-wise correlation of TCF error shows that it is not random ”

            To which you (1sky1) replied, “Spatial-lag autocorrelation of cloud-cover (sic!) error merely indicates that it doesn’t vary spatially like white noise.

            I directed you to the pair-wise TCF error correlations in Table 1, sky.

            You quoted Lauer and Hamilton as, “annual means of spatial variability” and followed that up with, “aggregate model mean CANNOT legitimately be denominated PER ANNUM”

            So, for you an annual mean is not per year. That is, (sum of magnitudes)/(number of years) is not magnitude/year.

            Summarizing, you’re claiming that an annual mean is not an annual mean.

            You wrote, “…Frank has claimed in shocking violation of the algebraic law that square roots are NOT distributive.

            So you’re denying the validity of paper eqns. 3 & 4 and denying propagation of error as (+/-)sqrt(sum of variances).

            You wrote, “Moreover, the correlation of 0.93 with observations doesn’t point to any truly gross modeling deficiencies.”

            That’s actually funny.

            Two linear series, one running from 0 to 1 in steps of 0.01 and the other from 0 to 100 in steps of 1, have a correlation of 1.00. And yet their final values differ by 99.

            That 99 could be error. The point is that correlation does not prescribe magnitude. Correlation of 0.93 does not tell us anything about the size of the error.
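
            A two-line check (mine, assuming NumPy) of that point:

```python
# Two perfectly correlated linear series whose final values differ by 99.
import numpy as np

a = np.arange(0, 1.01, 0.01)      # 0 to 1 in steps of 0.01
b = np.arange(0, 101.0, 1.0)      # 0 to 100 in steps of 1
print(np.corrcoef(a, b)[0, 1])    # 1.0 -- correlation says nothing about magnitude
print(b[-1] - a[-1])              # 99.0
```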

            You wrote, “While such do exist, they are not at all those ostensibly discovered here.

            I did not discover any modeling deficiencies. The deficiencies were discovered and reported by Lauer and Hamilton. I just derived one of the consequences of those deficiencies.

          • Nick Stokes, “ … 20-yr annual means of total spatial variability .. is not variability of a spatial average over time.

            Sure Nick. The variability of the error of each model is not the variability of the average error. Great point.

            Meanwhile, “The overall comparisons of the annual mean cloud properties with observations are summarized for individual models and for the ensemble means by the Taylor diagrams for CA, LWP, SCF, and LCF shown in Fig. 3. These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means. … (my bold)”

          • Pat,
            “That is, (sum of magnitudes)/(number of years) is not magnitude/year.
            Summarizing, you’re claiming that an annual mean is not an annual mean.”

            (sum of magnitudes)/(number of years) is not any kind of mean, since the sum is over n, which is not the number of years. High school kids know that. It is the elementary error in S6.2.

            “Meanwhile,”
            The paper is clear enough. But in any case Lauer himself said your interpretation is nonsense (via Brown):
            “I have contacted Axel Lauer of the cited paper (Lauer and Hamilton, 2013) to make sure I am correct on this point and he told me via email that “The RMSE we calculated for the multi-model mean longwave cloud forcing in our 2013 paper is the RMSE of the average *geographical* pattern. This has nothing to do with an error estimate for the global mean value on a particular time scale.”.”

        • kribaez, you wrote, “My concern here is with your methodology.

          Have you seen the post from Nick, above? He’s a physicist, apparently thoroughly familiar with uncertainty analysis.

          You wrote, “equi-probable sampling of the input space will yield via the model the joint distribution of the outputs. This output space – defined via the model – … The above principle forms the foundation for uncertainty analysis.

          An uncertainty equivalent to precision, kribaez, as you are assessing variability of model response, not accuracy versus known standards.

          You wrote, “On the contrary, sampling from the input distribution should reveal the output uncertainty – sufficient sometimes to justify additional data collection or scrapping the model. You seem to be denying this.

          As I understand your point, you’re proposing that sampling the uncertainty range in inputs establishes the uncertainty in outputs. I don’t deny that. I merely recognize it is an estimate of model precision. Not of model accuracy.

          You wrote, “If I consider a linear model of the form Y = bX. Sampling of an input distribution of b, will yield the correct uncertainty for Y at some value of X

          Yes, but it will not reveal the distance from the physically correct value of Y.

          If you “have a measurement uncertainty in X of (+/-) 2,” and X enters into a sequential series of calculations to predict the behavior of Y in a sequential series of ‘n’ future states, then the (+/-)2 of X gets propagated through that series as the rss, sqrt[sum over n*(4)], so that the uncertainty in Y grows.

          Your approach is fine for engineering models, kribaez, where parameters are calibrated to reproduce observables within some calibration bound. You must know well that calibrated engineering models do not reliably predict observables beyond those bounds.

          However, prediction beyond their calibration bounds is exactly what is being asked for GCMs, which are engineering models. Prediction of a forward state requires a different approach to uncertainty than monitoring model variability within its calibration bounds.

          You wrote, “I have tried to point out above that your emulation model is inadequate to the task you are setting it here, because of its inability to distinguish between a component of flux, the net flux (balance) and a forcing.

          Emulation eqn. 1 does not need to distinguish anything. The error term is brought in from outside that equation. The (+/-)4 W/m^2 of LWCF error from Lauer and Hamilton is clearly not a forcing. Neither is it a component of flux, nor a part of the net flux. It is a simulation uncertainty stemming from models that are in simulated flux balance.

          You wrote, “An uncertainty in a component of flux does not translate into an error in net flux at the end of the spin-up period.

          The error is in the physical theory of clouds as deployed in the model specifically, and in the overall deficiency of the physical theory in general.

          Spin-up of a physically wrong climate-state does not lead to a representation of the physically correct state. The equilibrium simulation may be stable. But the simulation state is not known to be a physically correct representation of the climate energy-state.

          An uncertainty interval is an ignorance interval. A state of ignorance does not improve when it is projected forward into the unknown.

          You wrote, “The mathematics of the problem will force the net flux into balance (actually with small fluctuations around zero), and the uncertainty in net flux is close to zero.

          And the error with respect to the physically correct flux? What is that? You don’t know, because the observational record is poor and the physical theory is deficient.

          You continued, “THIS DOES NOT LEAVE AN INVISIBLE UNCERTAINTY IN NET FLUX, …

          It leaves a simulation error of unknown sign and magnitude in the total cloud fraction, which in turn produces a corresponding error in long wave cloud forcing within the simulation. With respect to a simulation that is made to be in TOA balance, that uncertainty in thermal energy flux is indeed invisible.

          continuing “ and nor does it represent any confusion between error and uncertainty.

          It does, actually. None of what you wrote addresses accuracy at all.

          continuing “ It is something which is forced by the mathematics of the problem.

          When did mathematics determine the physics?

          You wrote, “We can obtain some approximation to [the uncertainty in temperature projection], but it requires a different type of analysis from the one you have carried out when comparing observed to modeled cloud fraction.

          Eqn. 1 shows that GCMs project temperature as a linear extrapolation of fractional GHG forcing, kribaez. Linear extrapolation is subject to linear propagation of error.

          The displayed sensitivity of eqn. 1 to tropospheric W/m^2 forcing is the same as the sensitivity of GCMs to tropospheric W/m^2 forcing. A coherence of sensitivity to uncertainty in forcing strictly follows.

          My analysis is clearly not how you’d have done it. But that does not make it incorrect.

          You wrote, “The uncertainty in temperature projection arising from the uncertainty in cloud characterisation is almost entirely associated with the uncertainty in the FEEDBACK to net flux,

          What you wrote, kribaez, means that cloud feedback to CO2 forcing is hugely uncertain relative to the size of CO2 forcing. Neither you nor anyone else knows how clouds will (or will not) respond to CO2 forcing.

          The (+/-)4 W/m^2 LWCF uncertainty is a measure of that uncertainty in cloud response, in a form (tropospheric thermal energy flux) that is directly applicable to the forcing introduced by CO2 emissions.

          GCMs cannot resolve the effect of increased CO2 forcing on clouds. They cannot resolve tropospheric thermal energy flux to better than (+/-)4 W/m^2. They cannot resolve the effect of a 0.035 W/m^2 increase in thermal energy flux on clouds. They cannot resolve the cloud feedback in response to that 0.035 W/m^2 increase in forcing.

          The effect of CO2 forcing on the climate is invisible to GCMs. That is why the uncertainty in air temperature increases year-to-year in a futures projection.

          The ignorance about the difference between the physically correct temperature and the simulated temperature grows with every step in a projection. Growth of ignorance = growth of uncertainty.

          You continued, “… you can quite legitimately argue that there does exist in this feedback system what you would describe as “a linear propagation of uncertainty”, but (a) it is NOT a linear propagation in time, …

          The cloud feedback uncertainty is not about the physical response, kribaez. It’s about the ability of the model to simulate the response. That is a linear propagation in step-number, not a linear propagation in time. It’s about the model, not the climate.

          continuing, “ …it is closer to a linear propagation of flux uncertainty with temperature – which makes an enormous difference, and (b) the resulting integrated series (in flux) does not translate the entire calibration error in cloud flux into the feedback error in net flux, since systemic bias in and of itself has little effect on the net flux.

          The physically correct net thermal flux is not simulated to better than (+/-)4 W/m^2, kribaez. It does not matter how the model behaves, or how the simulated climate behaves. It does not matter that the TOA balance is maintained (which it always is in simulations).

          What matters is that the tropospheric energy state is wrong. Cloud behavior is wrong. The impact of CO2 forcing is invisible. Each step in the sequence of steps has a (+/-)4 W/m^2 uncertainty in tropospheric thermal energy flux. The impact of CO2 forcing (0.035 W/m^2) is lost within it.

          The physically correct behavior of air temperature in response to CO2 forcing can therefore not be known. The simulation wanders continually away from the physically correct state in the climate phase-space because clouds are incorrectly simulated. The distance between the simulation and physically correct trajectories is not known, and is known ever more poorly as the simulation proceeds.

          The growth in uncertainty reflects this growth in ignorance. It is not saying anything about how the model is behaving. It is saying everything about what we actually know.

          You wrote, “As it is, I believe it is very poorly founded, and I take no pleasure in stating this.

          You’re a good guy, kribaez, thanks. But your analysis is misconceived.

          You’re thinking of calibration uncertainty as physical error and as though uncertainty were revealed in the behavior of the model. It isn’t, and it isn’t.

          • “Words, words, I’m so sick of words…show me now!”. Thus sang Eliza Doolittle in “My Fair Lady”, though she was speaking of physical love rather than physical physics. For me, although some words are needed for explanation, I am going to keep on harping “show me the mathematics!”. In my Note 8 above I gave Equation (*), which is really important because with ‘a’ set to 1 it is I believe Pat’s Equation 1 (or 5) stripped to its bare bones, yet Pat hasn’t commented on it. Here it is again (with z for mean instead of m):

            M(t) = a M(t-1) + B(t;z,s) (*)

            I believe this equation is a good representation of Pat’s and the B distributions can be called uncertainty distributions, and they combine in a RMS fashion just as Pat asserts provided independence applies. Moreover, as kribaez would put it, these uncertainties are not “invisible to Monte Carlo sampling”, which means that statistical inferences can be made on z and s.

            Show me where that is wrong; show me now! If it is in fact correct then it is a great basis for making progress against this problem, instead of just “waving our hands” (as mathematicians tend to say) that +/-4 W/m^2 indisputably propagates RMS-wise through the model, invisibly and without means of testing.

            I’ll now throw another spanner in the works.

            So far I have considered a=1. But what if in the GCMs a<1? Then

            Var(M(t)) = a^2 Var(M(t-1)) + s^2 = a^(2t) Var(M(0)) + s^2 (1-a^(2t))/(1-a^2) (**)

            This is the case of structural damping in the model, which my “Note 7: How does a cloud in 1990 affect me now?” suggests may actually be the case.

            In this Equation (**), as t tends to infinity, Var(M(t)) tends to s^2/(1-a^2), a finite limit.
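
            A minimal numerical sketch (mine, not Rich’s, assuming NumPy) of Equation (**), treating B as N(z, s^2): with a = 1 the spread of M(t) grows like s*sqrt(t), while with a < 1 it saturates at s/sqrt(1-a^2):

```python
# Simulate M(t) = a*M(t-1) + B(t), B ~ N(z, s^2), over many trials and
# compare the final spread with the two limits discussed above.
import numpy as np

rng = np.random.default_rng(2)
z, s, T, n = 0.0, 1.0, 500, 20_000

for a in (1.0, 0.9):
    M = np.zeros(n)
    for t in range(T):
        M = a * M + rng.normal(z, s, size=n)
    print(a, round(M.std(), 2))
# a = 1.0: std(M) ~ s*sqrt(T) = 22.4  (no damping, random-walk growth)
# a = 0.9: std(M) ~ s/sqrt(1 - a^2) = 2.29  (the finite limit of Equation (**))
```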

            As I said before, you can linearize the model mean, but you can’t validly linearize the model error/uncertainty without knowing much more about the structure of the model, because its evolution is highly germane to the way that its uncertainty propagates.

          • Rich,

            “M(t) = a M(t-1) + B(t;z,s) (*)

            I believe this equation is a good representation of Pat’s and the B distributions can be called uncertainty distributions, and they combine in a RMS fashion just as Pat asserts provided independence applies.”

            Your equation is wrong. B(t;z,s) doesn’t add to M. The equation should be
            M(t) = a M(t-1) +/- B(t;z,s) (*)

            It’s no different than saying: This one foot ruler = 12″ +/- 1″. The uncertainty specification doesn’t add to the 12″. It only says the ruler’s length can actually be between 11″ and 13″. It doesn’t say what the length really is so it can’t be considered as adding or subtracting anything specific.

            If you’ll look carefully, Pat’s equation says +/- 4 W/m^2. It doesn’t say +4 W/m^2.

            “Moreover, as kribaez would put it, these uncertainties are not “invisible to Monte Carlo sampling”, which means that statistical inferences can be made on z and s.”

            I think I covered this already. If the output of the GCM is deterministic, i.e. it gives the same answer every time for the same input values, then Monte Carlo analysis can’t tell you what the uncertainty is. Every output of every run will have an uncertainty associated with it. If the uncertainty factor remains +/- 4 W/m^2 for all runs, then that factor will compound over each iteration of each and every run, no matter what the input values actually are. You simply cannot reduce uncertainty by making multiple runs with different input values.

            “In this Equation (**), as t tends to infinity, Var(M(t)) tends to s^2/(1-a^2), a finite limit.”

            Uncertainty is a +/- specification. It’s a constant in Pat’s analysis = +/- 4 W/m^2. How does a constant have a variance and standard deviation? How does a constant become damped and go to zero?

            “As I said before, you can linearize the model mean, but you can’t validly linearize the model error/uncertainty”

            As Pat keeps saying – “uncertainty is not error, error is not uncertainty”. Pat didn’t linearize the uncertainty factor, he came up with a constant identified as an average over a number of years. You can argue that his value is incorrect, that it should be something else. But then you need to actually show where Pat’s analysis used to develop the constant went wrong. It certainly looks legitimate to me.

          • Pat,
            Once again, thank you for the detailed response.

            You are confirming, I think, that you accept that sampling from the full joint distribution of inputs when mapped via “the governing model” to outputs must yield the full output space. The only uncertainty which is invisible to this process arises from the choice or validity of the model itself. It is an important point and one which many of your supporters on this thread seem to have difficulty in accepting.

            I was entirely comfortable with your comments until you got to:-
            “If you “have a measurement uncertainty in X of (+/-) 2,” and X enters into a sequential series of calculations to predict the behavior of Y in a sequential series of ‘n’ future states, then the (+/-)2 of X gets propagated through that series as the rss, sqrt[sum over n*(4)], so that the uncertainty in Y grows.”

            In the problem as set, this is a monstrously wrong answer, and I would invite you to spend a couple of minutes thinking seriously about this, because it speaks to one of the most controversial elements in your paper. In the hypothetical problem as set, we had a model Y = bX with an accurately known value of b, and we had a calibration error in X of (+/-)2. I repeat for emphasis that the measurand for the calibration is the variable X.

            As I reported, the resulting uncertainty in Y is always (+/-)2b in this case, something which is simple to verify. Residuals are stationary. Confidence intervals are invariant with X. There is no growth of the uncertainty in Y as X varies.
            If X is varying (as well) as some (unknown) function of time, then an operator might be asked to measure X at regular time intervals, and then record the change in X since the last measurement. We can now trivially rewrite our model as
            Yi = b(Xi-1 + ΔXi ) where i is indexing the timesteps. Has the uncertainty in Y changed at all? The answer is no, it has not changed one whit, not at all. The reason is that the measurand is still X and NOT ΔX, so the new value of Xi is still (always) carrying an error bar of only (+/-2), and the uncertainty in Yi has a range of (+/-)2b. There is no integration of previous errors in the Xi values.
            To get to your answer, you have to change the question, and in particular you have to change the measurand for the calibration error. The new question is:- Suppose that YOU HAVE NO ABILITY TO CALIBRATE X, but that you have the ability to calibrate the accuracy of the measured CHANGE IN X over each recorded timestep; from this, it is determined that each ΔXi carries an error bar of (+/-)2. What is the uncertainty in Y after n timesteps? This new problem now presents a classic integrating series of order 1 with the variance in X rising after n steps to nVar(ΔXi), or, as you put it, “the (+/-)2 of X gets propagated through that series as the rss, sqrt[sum over n*(4)]”.
            The calibrated measurand in the first question is an already integrated series, and the uncertainty is stationary. The calibrated measurand in the second question is the first difference of that series in statistical jargon, and the uncertainty in its sum accumulates like a random walk. The difference is enormous. I would therefore invite you to consider very carefully whether in your calibration of LWCF, the RMSE you have found is a measure of error in the total LWCF (the first problem) or a measure of the error on the change in LWCF each year (the second problem).
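
            A minimal sketch (mine, not kribaez’s, assuming NumPy) of the two questions, treating the (+/-)2 error bar as a 1-sigma normal error for simplicity; question 1 calibrates X itself, question 2 calibrates only the per-step change in X:

```python
# Question 1: the measurand is X, so each reading of the total carries one
# fresh error and the spread is stationary.
# Question 2: the measurand is the CHANGE in X each step and X is rebuilt by
# summing measured increments, so the errors integrate like a random walk.
import numpy as np

rng = np.random.default_rng(3)
n_steps, n_trials, sigma = 100, 20_000, 2.0
true_dX = np.ones(n_steps)                          # X rises by 1 per step
true_X = true_dX.cumsum()

# Question 1: error on the measurement of total X
X_meas_1 = true_X[-1] + rng.normal(0, sigma, n_trials)
print(X_meas_1.std())                               # ~2, independent of n_steps

# Question 2: error on each measured increment, then summed
dX_meas = true_dX + rng.normal(0, sigma, (n_trials, n_steps))
X_meas_2 = dX_meas.sum(axis=1)
print(X_meas_2.std())                               # ~2*sqrt(100) = 20
```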

            As for the rest of your response, I have read it carefully, but I still believe that the limitations of your emulator are leading you to add different animals together inappropriately, and I won’t bore you by repeating the same arguments. I think that we will have to respectfully agree to disagree.

            I wish you well in any case and hope that the future feedback is not too painful.

          • “The reason is that the measurand is still X and NOT ΔX, so the new value of Xi is still (always) carrying an error bar of only (+/-2), and the uncertainty in Yi has a range of (+/-)2b.”

            I’m sure Pat will respond, but it would seem you forgot that X(i-1) has already incurred an uncertainty of +/- 2. Then *another* uncertainty of +/- 2 is added with the new ΔX. You wind up with a series of uncertainties which accumulate as the root-sum-square.

            I hate to keep going back to the one foot ruler that is 12″ +/- 1″ but it is the same thing. If you measure the width of a room, i.e. Y, then you get a formula of Y=bX where b is the number of times the ruler is laid end-to-end and X is the length in inches of the ruler. If you lay the ruler end-to-end ten times in order to measure the room the uncertainty in the measurement is not +/- 1″. The uncertainty compounds with each iteration.

          • Reply to Tim Gorman Oct2 7:17am

            Tim,

            The problem is that you do not understand random variables (things which have a probability distribution), random variates (actual values that a random variable takes), and standard deviations (the “bound” on a random variable), though I did try to explain this earlier. But I think this is the crux of Pat/Tim versus kribaez/Rich, so it is important to explore it.

            So let’s take your rewriting of my equation:

            M(t) = a M(t-1) +/- B(t;z,s)

            Now my B(t;z,s) is a random variable, so taking +/- on it makes no sense. But there is a version we can write with a +/-, though it is really just a shorthand version of mine, and it is:

            M(t) = a M(t-1) + z +/- s

            which is like Pat’s Equation 5. To take your ruler example suppose we ordered a batch of 12” rulers and they all happen to be between 11.3 and 13.3”. Then we can think of z=0.3” as the bias in the rulers and s=1”, or +/-1” if you prefer, as the uncertainty in them. If we choose and use just one ruler then after t uses the bias will have accumulated as 0.3t” and the uncertainty will have accumulated as +/-t”.

            However, we know that Pat does not use a single ruler, because he accumulates uncertainty proportional to sqrt(t). I challenge any mathematician to justify that without using the equivalent of my (*) equation which uses a random variable, and indeed Nick Stokes checked that Pat’s references do just that.
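
            A short sketch (mine, not Rich’s, assuming NumPy) of the ruler contrast, with a batch of rulers uniform between 11.3″ and 13.3″: reusing one mis-calibrated ruler accumulates its error linearly with the number of uses, while drawing a fresh ruler each time gives the sqrt(t) accumulation:

```python
# One reused ruler versus a fresh ruler from the batch on every use.
# Actual lengths uniform on 11.3"..13.3": bias 0.3", spread ~0.577" per use.
import numpy as np

rng = np.random.default_rng(4)
uses, trials = 100, 20_000

def length_errors(size):
    """Per-use error of a ruler drawn from the batch, relative to 12 inches."""
    return rng.uniform(11.3, 13.3, size) - 12.0

one_ruler = uses * length_errors(trials)                  # same ruler every time
fresh_ruler = length_errors((trials, uses)).sum(axis=1)   # new ruler each use

print(one_ruler.mean(), one_ruler.std())     # bias ~30", spread ~100*0.577 = 58"
print(fresh_ruler.mean(), fresh_ruler.std()) # bias ~30", spread ~sqrt(100)*0.577 = 5.8"
```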

            Re Monte Carlo sampling and determinism: yes, kribaez has said that with the exact same input you get the exact same output with a GCM, so in that case you can learn nothing about the uncertainty. But he has also said that if you perturb the input just a little then because weather is mathematically chaotic then the output changes a lot, and the difference between the runs tells you something about the uncertainty.

            With respect Tim, I don’t think that you have sufficient mathematics to argue this point, but Pat has – I respect a lot of his maths even though I am worried about his assumptions. If Pat can write down some maths to argue this point against me, I’ll look very closely at it. The mathematical distinction between error and uncertainty is key to the whole debate, which cannot be resolved by words; show me the mathematics, starting from my Equation (*).

          • Rich,

            “Now my B(t;z,s) is a random variable, so taking +/- on it makes no sense”

            If B is supposed to be the uncertainty then it is not a random variable. It is an interval, not a variable value. Like with the ruler example, the uncertainty doesn’t change from iteration to iteration. The error in the output from iteration to iteration might change but not the uncertainty interval. The error should always be inside the uncertainty interval.

            “To take your ruler example suppose we ordered a batch of 12” rulers and they all happen to be between 11.3 and 13.3”. Then we can think of z=0.3” as the bias in the rulers and s=1”, or +/-1” if you prefer, as the uncertainty in them”

            The key word you used is “between”. If they are all inside an interval then that interval is the uncertainty. If some were exactly 11.3″ then those would have a bias of -0.7″. If all the rest were exactly 13.3″ long then they would have a bias of +1.3″. But if you don’t know where in the interval from 11.3″ to 13.3″ each ruler lies, then that interval is the uncertainty interval. Your uncertainty is +1.3″, -0.7″.

            Bias (an alias for error) is *not* uncertainty. The word “between” indicates uncertainty.

            “However, we know that Pat does not use a single ruler, because he accumulates uncertainty proportional to sqrt(t). I challenge any mathematician to justify that without using the equivalent of my (*) equation which uses a random variable, and indeed Nick Stokes checked that Pat’s references do just that.”

            It doesn’t matter which ruler you use if the uncertainty interval is the same for all of them. Pat uses root-SUM-square, not root-mean-square. Root-mean-square is used in determining a standard deviation. Just using the sum of the squares doesn’t make it into a probability distribution. Root-mean-square uses (x1^2 + x2^2 + … + xn^2)/n (i.e. the variance). As Pat’s Eq. 6 shows, the uncertainty is just the sum; there is no division by n. Two different things.

            “But he has also said that if you perturb the input just a little then because weather is mathematically chaotic then the output changes a lot, and the difference between the runs tells you something about the uncertainty.”

            But it doesn’t tell you about the uncertainty. All it tells you is that the output of the model is based on a higher order function, e.g. squared, cubed, etc.

            “With respect Tim, I don’t think that you have sufficient mathematics to argue this point, ”

            What you think about me is irrelevant. I’ve given you the math. I know enough math to know that a Monte Carlo analysis can’t define uncertainty for a deterministic model. You have confused root-mean-square with root-sum-square and in doing so have convinced yourself that uncertainty is a probability function and not an interval. Error is a probability function and the error value will lie within the uncertainty interval. It’s that simple. And uncertainty grows with each iteration. You can’t cancel it using the central limit theorem.

            The whole issue here is that if what you are trying to measure is within the uncertainty interval then you simply don’t know if you have actually measured anything. If your uncertainty interval is +/- 0.1C and you are trying to define a difference of 0.01C between two outputs then you are only kidding yourself that you actually know there really is a difference of 0.01C!

          • kribaez,

            getting to the central issue you wrote, “we had a model Y = bX with an accurately known value of b, and we had a calibration error in X of (+/-)2. I repeat for emphasis that the measurand for the calibration is the variable X. As I reported, the resulting uncertainty in Y is always (+/-)2b in this case, something which is simple to verify. Residuals are stationary.

            You’re presuming the uncertainty is due to random error. However, this condition is not known to be true. I apologize for not being clear about this in my response.

            You’re also presuming the calculation is always single-step, Y = bX, with X varying but always (+/-)2. This does not describe the effect of uncertainty on sequential calculations involving X.

            Let’s suppose your model is Y = bX, the uncertainty in X is (+/-)2, and that uncertainty is from normally distributed error.

            Then the calculation Y = bX(+/-)2 puts an uncertainty in Y of (+/-)n.

            In that case, Y_1 = bX_1(+/-)2 = Y_1(+/-)n, and Y_2 = bX_2(+/-)2 = Y_2(+/-)n

            Now suppose the sequence of states proceeds to a final state, Y_Final = Y_F = Y_1(+/-)n+Y_2(+/-)n + … +Y_f(+/-)n. Y_F is a sum; not an average.

            Each Y includes a (+/-)n. The uncertainty in the final Y_F = (+/-)sqrt(f*n^2).

            If the (+/-)2 is normally distributed in X, then many measurements of X can reduce the magnitude of uncertainty in X and therefore reduce the magnitude of (+/-)n. But the uncertainty in the sum of Y_i will always be >(+/-)n.

            In the case of climate models, total cloud fraction (TCF) error is systematic and non-normal (as it is in the historical air temperature measurements). The calibration error in LWCF is also not known to be normal.

            You wrote, regarding your new model, “If X is varying (as well) as some (unknown) function of time, then an operator might be asked to measure X at regular time intervals, and then record the change in X since the last measurement.

            We can now trivially rewrite our model asYi = b(Xi-1 + ΔXi ) where i is indexing the timesteps. Has the uncertainty in Y changed at all? The answer is no,…

            The way you wrote it, ΔXi = X_i(+/-)2 – X_i-1(+/-)2, which means the uncertainty in ΔXi = sqrt[(2)^2+(2)^2] = (+/-)2.8.

            Now we take Xi-1(+/-)2+ΔXi(+/-)2.8, and the new uncertainty in X = sqrt[(2)^2+(2.8)^2] = (+/-)3.5.
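
            A quick check (mine) of that quadrature arithmetic:

```python
# Root-sum-square combination of independent uncertainties, as used above.
import math

def rss(*u):
    """Root-sum-square of independent uncertainty components."""
    return math.sqrt(sum(x * x for x in u))

d = rss(2, 2)                             # uncertainty in ΔXi = Xi - Xi-1
print(round(d, 1), round(rss(2, d), 1))   # 2.8  3.5 -- uncertainty in Xi-1 + ΔXi
```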

            However, I take your point that the uncertainty in each re-measured X is constant (+/-)2 even though the magnitude of X may vary.

            But your time-step model involves changes in X.

            The time steps in an air temperature projection involve progressive changes in Y, in each step of which the uncertainty in X enters anew.

            You wrote, “in particular you have to change the measurand for the calibration error.

            No, actually. The “measurand” is unchanged. It remains as before. Your ΔXi always remains as you have it. The uncertainty in eqn. 1 is not in ΔFi (your ΔXi). The calibration error is independent of the ΔFi.

            The (+/-)4 W/m^2 comes in as the simulation uncertainty characteristic of the model. It arises from the LWCF calibration error of CMIP5 GCMs.

            You wrote, “The calibrated measurand in the first question is an already integrated series, and the uncertainty is stationary.

            The uncertainty is stationary in your model. Not in GCM TCF error and so also not in LWCF calibration error.

            Also, your model presumed solitary calculations of Y, which is not analogous to the analysis in the paper.

            You wrote, “The calibrated measurand in the second question is the first difference of that series in statistical jargon, and the uncertainty in its sum accumulates like a random walk.

            You’re treating the calibration uncertainty as a physical error. In a predictive uncertainty analysis, one never knows how the error accumulates. Sum of physical errors is not rss of predictive uncertainty.

            You wrote, “I would therefore invite you to consider very carefully whether in your calibration of LWCF, the RMSE you have found is a measure of error in the total LWCF (the first problem) or a measure of the error on the change in LWCF each year (the second problem).

            We know what the LWCF calibration error is.

            It is derived from the annual rms of (simulation minus observed) TCF error. The LWCF rmse is not a physical error. It is a calibration error statistic. It does not sum and does not accumulate as a random walk.

            The LWCF error is a measure of the ignorance of the magnitude of simulated tropospheric thermal energy flux.

            When ΔFi enters that simulation, the impact of ΔFi on clouds is not known to better than allowed by the (+/-)4 W/m^2 uncertainty in tropospheric thermal energy flux.

            Cloud response to CO2 forcing is not known to better than allowed by (+/-)4 W/m^2.

            It’s a straightforward concept, kribaez. Simulation TCF error and LWCF rmse say that one cannot know how clouds respond to GHG forcing.

            That means one cannot know how the air temperature changes with GHG emissions.

            The resolution of the GCMs is too coarse to resolve the effect of GHGs on the climate.

            I’ll repost an analysis I posted elsewhere already. It gives another approach to the problem of model resolution in terms of CO2 forcing.

          • kribaez,

            this illustration might clarify the meaning of (+/-)4 W/m^2 of uncertainty in annual average LWCF.

            The question to be addressed is what accuracy is necessary in simulated cloud fraction to resolve the annual impact of CO2 forcing?

            We know from Lauer and Hamilton that the average CMIP5 (+/-)12.1% annual cloud fraction (CF) error produces an annual average (+/-)4 W/m^2 error in long wave cloud forcing (LWCF).

            We also know that the annual average increase in CO2 forcing is about 0.035 W/m^2.

            Assuming a linear relationship between cloud fraction error and LWCF error, the (+/-)12.1% CF error is proportionately responsible for (+/-)4 W/m^2 annual average LWCF error.

            Then one can estimate the level of resolution necessary to reveal the annual average cloud fraction response to CO2 forcing as, (0.035 W/m^2/(+/-)4 W/m^2)*(+/-)12.1% cloud fraction = 0.11% change in cloud fraction.

            This indicates that a climate model needs to be able to accurately simulate a 0.11% feedback response in cloud fraction to resolve the annual impact of CO2 emissions on the climate.

            That is, the cloud feedback to a 0.035 W/m^2 annual CO2 forcing needs to be known, and able to be simulated, to a resolution of 0.11% in CF in order to know how clouds respond to annual CO2 forcing.

            Alternatively, we know the total tropospheric cloud feedback effect is about -25 W/m^2. This is the cumulative influence of 67% global cloud fraction.

            The annual tropospheric CO2 forcing is, again, about 0.035 W/m^2. The CF equivalent that produces this feedback energy flux is again linearly estimated as (0.035 W/m^2/25 W/m^2)*67% = 0.094%.

            Assuming the linear relations are reasonable, both methods indicate that the model resolution needed to accurately simulate the annual cloud feedback response of the climate, to an annual 0.035 W/m^2 of CO2 forcing, is about 0.1% CF.
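
            The two estimates can be reproduced with a few lines (mine), using the figures quoted above:

```python
# Arithmetic check of the two linear resolution estimates described above.
co2_annual = 0.035          # W/m^2, annual-average CO2 forcing increment
lwcf_rmse  = 4.0            # W/m^2, CMIP5 annual-average LWCF calibration error
cf_error   = 12.1           # %, CMIP5 annual cloud-fraction error
print(co2_annual / lwcf_rmse * cf_error)        # ~0.11 % cloud fraction

cloud_effect = 25.0         # W/m^2, magnitude of the net cloud forcing quoted above
cf_global    = 67.0         # %, global cloud fraction
print(co2_annual / cloud_effect * cf_global)    # ~0.094 % cloud fraction
```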

            To achieve that level of resolution, the model must accurately simulate cloud type, cloud distribution and cloud height, as well as precipitation and tropical thunderstorms.

            This analysis illustrates the meaning of the (+/-)4 W/m^2 LWCF error. That error indicates the overall level of ignorance concerning cloud response and feedback.

            The CF ignorance is such that tropospheric thermal energy flux is never known to better than (+/-)4 W/m^2. This is true whether forcing from CO2 emissions is present or not.

            GCMs cannot simulate cloud response to 0.1% accuracy. It is not possible to simulate how clouds will respond to CO2 forcing.

            It is therefore not possible to simulate the effect of CO2 emissions, if any, on air temperature.

            As the model steps through the projection, our knowledge of the consequent global CF steadily diminishes because a GCM cannot simulate the global cloud response to CO2 forcing, and thus cloud feedback, at all for any step.

            It is true in every step of a simulation. And it means that projection uncertainty compounds because every erroneous intermediate climate state is subjected to further simulation error.

            This is why the uncertainty in projected air temperature increases so dramatically. The model is step-by-step walking away from initial value knowledge further and further into ignorance.

            On an annual average basis, the uncertainty in CF feedback is (+/-)114 times larger than the perturbation to be resolved.

            The CF response is so poorly known, that even the first simulation step enters terra incognita.

  84. Dr Frank,
    A continuation of my previous post.

    You attach some importance to the fact that the TCF shows evidence of high lag-1 autocorrelation in individual models.
    It is of some note that the comparison of observed to modeled was over a period (1980 to 2004) when temperatures were rising sharply in the models.
    With the LTI model, I ran a projection of a 1% p.a. increase in CO2 for 70 years and then a constant forcing thereafter. I left in the system a random annual net flux error drawn from a N(0,0.3^2). I then calculated the flux feedback from cloud forcing as a linear function of Temp using 0.5 Watts/m2/K. Unsurprisingly, this series showed very high lag-1 autocorrelation. The reason is simply that the temperature over this period was rising almost linearly in the model, which meant that the flux feedback from cloud forcing was rising almost linearly by assumption.
    Any series which is approximately linear in time will show an autocorrelation close to unity (a short numerical check is sketched at the end of this comment).
    It is important to note that I also separately checked the flux feedback from cloud forcing over the period AFTER the temperature stabilised into small oscillations (from the random annual net flux error around a constant value). Even under these circumstances, the flux feedback series showed autocorrelation of 0.91. This arises directly from the autocorrelation in the temperature series, which is determined by the second term in Eq 3a above.
    So autocorrelation in the flux series is unsurprising. The autocorrelation which you found in the residuals is also unsurprising, especially since the temperature over the period was rising sharply. Your observation may (and probably does) indicate an error in the effective cloud feedback, but such error does not propagate like an integrating series in flux.
    In summary, I don’t doubt that there is a combination of systemic bias in TCF and poor estimation of temperature-dependent cloud feedback in the GCMs. However, I can still find no justification for your belief that uncertainty in TCF can be translated into an uncertainty in forcing, and even less that it propagates as an integrating series.
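
    A short numerical check (mine, not the LTI model itself) of the point that a roughly linear series shows lag-1 autocorrelation near unity even when the added noise is white; the trend and noise amplitudes below are arbitrary:

```python
# Lag-1 autocorrelation of a linear trend plus white noise over ~25 annual values.
import numpy as np

rng = np.random.default_rng(6)
t = np.arange(25)                                # ~25 annual values, as in 1980-2004
series = 0.02 * t + rng.normal(0, 0.05, t.size)  # linear trend + white noise

lag1 = np.corrcoef(series[:-1], series[1:])[0, 1]
print(round(lag1, 2))                            # typically around 0.9, despite white noise
```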

      • Pat,
        Thank you for this. My bad reading. And I agree with you that it implies deterministic or structural error in the models. I understood you to be drawing an inference that the autocorrelation supported your treatment of uncertainty propagation.

  85. kribaez: In summary, I don’t doubt that there is a combination of systemic bias in TCF and poor estimation of temperature-dependent cloud feedback in the GCMs. However, I can still find no justification for your belief that uncertainty in TCF can be translated into an uncertainty in forcing, and even less that it propagates as an integrating series.

    I think you have made a case that uncertainty is greater than the estimate provided by Pat Frank.

    As to the “one size fits all”, I would counter that this is a well-done first approximation to GCM model uncertainty related to uncertainty in one of the parameters, and it can be improved upon eventually, but not really soon. I look forward to reading more work on GCM prediction uncertainty.
