Additional Comments on the Frank (2019) “Propagation of Error” Paper

From Dr Roy Spencer’s Blog

September 12th, 2019 by Roy W. Spencer, Ph. D.

NOTE: This post has undergone a few revisions as I try to be more precise in my wording. The latest revision was at 0900 CDT Sept. 12, 2019.

If this post is re-posted elsewhere, I ask that the above time stamp be included.

Yesterday I posted an extended and critical analysis of Dr. Pat Frank’s recent publication entitled Propagation of Error and the Reliability of Global Air Temperature Projections. Dr. Frank graciously provided rebuttals to my points, none of which have changed my mind on the matter. I have made it clear that I don’t trust climate models’ long-term forecasts, but that is for different reasons than Pat provides in his paper.

What follows is the crux of my main problem with the paper, which I have distilled to its essence, below. I have avoided my previous mistake of paraphrasing Pat, and instead I will quote his conclusions verbatim.

In his Conclusions section, Pat states “As noted above, a GCM simulation can be in perfect external energy balance at the TOA while still expressing an incorrect internal climate energy-state.”

This I agree with, and I believe climate modelers have admitted to this as well.

But, he then further states, “LWCF [longwave cloud forcing] calibration error is +/- 144 x larger than the annual average increase in GHG forcing. This fact alone makes any possible global effect of anthropogenic CO2 emissions invisible to present climate models.”

While I agree with the first sentence, I thoroughly disagree with the second. Together, they represent a non sequitur. All of the models show the effect of anthropogenic CO2 emissions, despite known errors in components of their energy fluxes (such as clouds)!

Why?

If a model has been forced to be in global energy balance, then energy flux component biases have been cancelled out, as evidenced by the control runs of the various climate models in their LW (longwave infrared) behavior:

Figure 1. Yearly- and global-average longwave infrared energy flux variations at top-of-atmosphere from 10 CMIP5 climate models in the first 100 years of their pre-industrial “control runs”. Data available from https://climexp.knmi.nl/

Importantly, this forced-balancing of the global energy budget is not done at every model time step, or every year, or every 10 years. If that was the case, I would agree with Dr. Frank that the models are useless, and for the reason he gives. Instead, it is done once, for the average behavior of the model over multi-century pre-industrial control runs, like those in Fig. 1.
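
A zero-dimensional toy sketch of this argument, not a description of how any actual GCM is built or tuned; the feedback parameter, heat capacity, and forcing series below are hypothetical. It shows that a constant bias in a simulated flux component, offset once against the control state, leaves the forced temperature response unchanged.

```python
import numpy as np

LAMBDA = 1.2   # net feedback parameter, W/m^2 per K (hypothetical)
C = 8.0        # effective heat capacity, W yr m^-2 K^-1 (hypothetical)

def run(years, forcing, cloud_bias):
    """Integrate dT/dt = (F(t) + bias - offset - LAMBDA*T) / C in one-year steps."""
    offset = cloud_bias          # diagnosed once against the control state
    T = 0.0
    temps = []
    for yr in range(years):
        T += (forcing(yr) + cloud_bias - offset - LAMBDA * T) / C
        temps.append(T)
    return np.array(temps)

ghg = lambda yr: 0.04 * yr       # slowly increasing GHG forcing, W/m^2 (hypothetical)

unbiased = run(100, ghg, cloud_bias=0.0)
biased = run(100, ghg, cloud_bias=4.0)   # constant 4 W/m^2 flux bias

print(np.allclose(unbiased, biased))     # True: identical warming response
```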

The ~20 different models from around the world cover a WIDE variety of errors in the component energy fluxes, as Dr. Frank shows in his paper, yet they all basically behave the same in their temperature projections for the same (1) climate sensitivity and (2) rate of ocean heat uptake in response to anthropogenic greenhouse gas emissions.

Thus, the models themselves demonstrate that their global warming forecasts do not depend upon those bias errors in the components of the energy fluxes (such as global cloud cover) as claimed by Dr. Frank (above).

That’s partly why different modeling groups around the world build their own climate models: so they can test the impact of different assumptions on the models’ temperature forecasts.

Statistical modelling assumptions and error analysis do not change this fact. A climate model (like a weather forecast model) has time-dependent differential equations covering dynamics, thermodynamics, radiation, and energy conversion processes. There are physical constraints in these models that lead to internally compensating behaviors. There is no way to represent this behavior with a simple statistical analysis.

Again, I am not defending current climate models’ projections of future temperatures. I’m saying that errors in those projections are not due to what Dr. Frank has presented. They are primarily due to the processes controlling climate sensitivity (and the rate of ocean heat uptake). And climate sensitivity, in turn, is a function of (for example) how clouds change with warming, and apparently not a function of errors in a particular model’s average cloud amount, as Dr. Frank claims.

The similar behavior of the wide variety of different models with differing errors is proof of that. They all respond to increasing greenhouse gases, contrary to the claims of the paper.

The above represents the crux of my main objection to Dr. Frank’s paper. I have quoted his conclusions, and explained why I disagree. If he wishes to dispute my reasoning, I would request that he, in turn, quote what I have said above and why he disagrees with me.

John M Brunette, Jr.
September 13, 2019 8:16 am

I still remain unconvinced that it’s even possible to model our climate at all, let alone the effect of CO2 within our atmosphere. There are just too many moving parts! You have the external forces we don’t understand, from the sun, and we’ve been experiencing changes there like we’ve never been able to see or measure before. You have the ocean currents, and Judith Curry’s latest paper on that and the effect of solar activity. You have all sorts of moisture spewed into the air through stacks, exhaust pipes, and of course major irrigation that increases constantly. You have the greening from increased CO2, and its effect on everything. Planes delivering some CO2 up high, while also delivering moisture and manufacturing false cloud cover in the process.

I know of no computer that can model all of this effectively. I know people always consider this metric or that metric to be weather, not climate, but after 30 or 40 years or more of these predictions, one would think we would have some model, any model, that could get our world figured out. I certainly don’t think it’s possible. If the old adage about a butterfly flapping its wings in Peru, or wherever, can affect the weather here in MN, I’d say we are going to need a lot of processing power to figure out how our climate will be affected when, for all we know, we’re causing more or fewer of these butterflies to be created, or to flap more or less when they are happy or sad! And then we imagine that this little trace gas is the culprit of a global disaster! Based on settled science, no less!

It seems we can’t even agree on the terms to model these things… but we should send the world back into poverty, just in case… of something that clearly is nowhere near as bad as it was touted to be. We need to put this whole thing on hold, and come up with a clear way to demonstrate real, measurable, “unfudgeable” results that hold up year over year. Then we can start tweaking parameters that make the difference, once we have a model that actually works. Right now, all we have are wild theories, and none of them seem to work, but we call it settled science!

angech
September 13, 2019 8:40 am

Good to spar, better to let others do it for you. Watching Roy v Pat with great interest.
Takeaway so far is Pat contending that he is just showing the propagation of systemic error in GCMs, quite different from random error, which both Pat and an oilman point out would tend to disappear with more runs.
Mathematically the systemic error, or its root mean square, used by Pat over a 65 year period has to lead to a large positive and negative uncertainty range, as he shows.
He specifically states it is only the maths in computer models he is discussing, not the physics, and not the real or observational world.
Mathematically this is sound.
Practically its relevance is debatable, but it does open up a can of worms about the way computer programmes now project uncertainty into the future.
Unless this is specifically addressed and tightened up by the programmers (Mosher would agree, as he supports more open coding) so that we can all see the assumptions and uncertainties going into these programs, Pat is free to make his comments, and those disagreeing need to show that their programmes do not have the systemic bias he seems to have identified.

Bindidon
Reply to  angech
September 13, 2019 9:46 am

angech

Good point. Sounds good to the zero-level layman in the field discussed here.

Windchaser
Reply to  angech
September 13, 2019 9:54 am

He specifically states it is only the maths in computer models not the physics that he is discussing not the real or observational world.
Mathematically this is sound.

Is it? He’s talking about the actual cloud forcing uncertainty, as measured in the real world, and saying this is 4W/m2/year, not 4W/m2. As a result, if you integrate this value with respect to time, the uncertainty in cloud forcing increases, year after year.

So is it really physically plausible that the real world cloud forcing uncertainty can increase, year-over-year, without bound?

Of course if you take that result and plug it into the model, you’ll get nonsense. It’s incorrect in the real world.

Reply to  angech
September 13, 2019 5:55 pm

“He specifically states it is only the maths in computer models not the physics”
But he says nothing about the maths actually in computer models, and I don’t think he knows anything about it. He made up some maths.

Reply to  Nick Stokes
September 13, 2019 10:30 pm

No, I developed an equation that accurately emulates GCM air temperature projections, Nick.

That’s hardly making stuff up.

As usual, you mislead to confuse.

Andrew Kerber
September 13, 2019 10:01 am

My take on this goes in another direction entirely. We continually see temperature measurements published to 0.1 or 0.01 degrees. But my own observation is that, realistically, the measurement could not possibly have any significant digits past the decimal place, just from the fact that temperature can vary dramatically within a few feet. Yet we see an anomaly calculation to 0.01 or 0.001 degrees, meanwhile claiming that the 20th century warming of about 2 degrees is ‘unprecedented’. With a realistic minimum confidence interval of +/-1 degree, and probably more like +/-2 degrees, how can we do anything but laugh at claims of ‘unprecedented warming’ and an anomaly measured to hundredths of a degree?

Windchaser
Reply to  Andrew Kerber
September 13, 2019 10:52 am

They present the temperature anomaly to hundredths of a degree, not the temperature.

While temperature varies rather quickly with altitude, shade, distance to the coast, etc., temperature anomaly, the integral of how temperature changes with respect to time, is a bit more constant. There are a bunch of studies of this.

Reply to  Windchaser
September 13, 2019 12:13 pm

You ignore the problem of how the base was calculated to obtain the anomaly and what measurement error propagation calculations are included in that. There is also the problem of just what an average temperature tells you about enthalpy.

Reply to  Windchaser
September 13, 2019 10:28 pm

In taking the difference to get an anomaly, the uncertainty in the temperature and the root-mean-error in the base-period average must be combined in quadrature, Windchaser.

The uncertainty in the anomaly is always larger than the uncertainty in the temperature.

This common practice is entirely missing in the work done by the compilers of the GMST record. It’s hugely negligent.

Those folks assume all measurement errors are random, and just blithely average them all away. It’s totally self-serving and very wrong. As demonstrated by the systematic non-random measurement errors revealed in sensor calibration experiments.

I have published on that negligence, here (1 mb pdf)

John_QPublic
September 13, 2019 11:15 am

A challenge to Pat Frank :

Windchaser pointed out that Dr. Brown contacted Dr. Lauer, and Dr. Lauer indicated that the 4 W per square meter uncertainty was not intended to be premised on any given timeframe.

https://patricktbrown.org/2017/01/25/do-propagation-of-error-calculations-invalidate-climate-model-projections-of-global-warming/#comment-1443

It is clear that Dr. Lauer derived this statistic from 20-year, multi-model average annual means. This implies a one-year timeframe may be reasonable. The square root of N in the divisor may be sufficient to control the period. It could also be the case that this is a fixed uncertainty bar on all measurements at any given time.

Could you comment more on the statistic and how you justify using it as an annual uncertainty?

Paul Penrose
Reply to  John_QPublic
September 13, 2019 12:12 pm

An average implies a division operation, does it not? What was the divisor? If it was time, then it has to be 4 W per square meter per something, depending on the unit of the divisor. So what was the divisor?

Windchaser
Reply to  Paul Penrose
September 13, 2019 12:51 pm

Say you have 100 people. Let’s say the average has $5, and let’s say there’s a little bit of variability, so there’s an uncertainty of +/- $1. In other words, if we randomly selected a person, ~95% of them would have between $4 and $6. That’s what that uncertainty means.

From these statistics you can say “there is $5/person, on average”, and then multiply by the number of people to get the total number of dollars. This is essentially integrating the average over the number of people to get the total number of dollars.
Or you can say “an average person has $5”. That’s the same thing.

But you cannot say “the average person has $5/person”, as that would imply that if you had a group of 4 people, the average person would have $20. Ex:

4 persons * “average has $5/person”= “average has $20”.

——————-

Now imagine you added an extra 100 people, basically as a copy of the first 100 people. Person #101 has the same $ as person #1. Person #102 has the same $ as person #2. And so on. A copy of the people and their $.

You now have 200 people, and the average person still has $5, right? Your average did not change just because you added more people.

Likewise, the uncertainty also stays the same. It’s still +/- $1. Both the uncertainty and the average do not change as you add more people.

So if we go out in the real world and measure the cloud forcing over some 20 years, and we get a value and some uncertainty, then the next year, we can still expect that the cloud forcing is going to be similar. And the year after that. And the year after that. (If it was properly measured and properly sampled). There may be uncertainty, but if you’ve measured it right, that uncertainty stays the same, month after month, year after year, decade after decade. The uncertainty already accounts for any variation, and 95% of the time, at any given moment, the cloud forcing should fall within that measured value + its uncertainty.

Just as doubling the number of people does not double the average $ or the uncertainty, doubling the number of years should not double the cloud forcing or its uncertainty.

There is still an uncertainty, and it propagates through any proper mathematical calculation. But it propagates through as W/m2, not as W/m2/time.
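
A minimal numeric sketch of the averaging point above, with hypothetical dollar figures: copying a population changes neither its mean nor its spread.

```python
import numpy as np

rng = np.random.default_rng(0)
dollars = rng.normal(5.0, 0.5, size=100)      # 100 people, around $5 each

doubled = np.concatenate([dollars, dollars])  # add an exact copy of everyone

print(dollars.mean(), dollars.std())          # e.g. ~5.0, ~0.5
print(doubled.mean(), doubled.std())          # identical: copying changes nothing
```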

JohnQPublic
Reply to  Windchaser
September 13, 2019 1:38 pm

I think you are putting too much emphasis on the units. I am not in a position to look at the paper now, but I do not believe that Pat Frank needs the units of flux per time. He just puts in the flux. When you do linear differential mathematics you just calculate the next state, and there are no considerations for units per time.

Mortimer Snerd
Reply to  Windchaser
September 13, 2019 2:10 pm

Windchaser,

In your example, $5/person × 4 person = $20; not $20/person. The units (person) still cancel out.

Matthew R Marler
Reply to  Windchaser
September 13, 2019 11:06 pm

Windchaser: In other words, if we randomly selected a person, ~95% of them would have between $4 and $6. That’s what that uncertainty means.

And make the further assumption that the mean is 5, and for simplicity let 4 and 6 be the known range, instead of a CI.

So one person has between 4 and 6, but you don’t know how much, only that the expected value is 5.

Two people have a total between 8 and 12, but you don’t know how much, except that the mean of 2 is 10.

Three people have a total between 12 and 18, but you don’t know how much, except that the mean is 15.

N people have a total between N*4 and N*6, but you don’t know how much except that the mean total for N is N*5. When the value added is unknown, but represented by an uncertainty interval, the uncertainty of the sum grows.

If (4,6) is a CI, then you have to work with the variance estimate, and sum the variances, and take the square root of the sum of variances (assuming independence, which Pat Frank does not do — he calculates the effect of the correlation.) But the result is still that the uncertainty in the total grows with N.
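
A short sketch of the contrast being drawn in this sub-thread, under stated assumptions (the per-person sigma and the $4 to $6 half-width are the hypothetical figures from the example): the uncertainty of a mean shrinks with N, while the uncertainty of a sum grows, as sqrt(N) * sigma if the errors are independent or as N times the half-width if only the interval is known.

```python
import math

sigma = 0.5       # per-person standard uncertainty, dollars (hypothetical)
half_width = 1.0  # per-person interval half-width: $4 to $6 about a $5 mean

for n in (1, 2, 3, 10, 100):
    sum_rss = math.sqrt(n) * sigma      # sum of N independent errors: grows as sqrt(N)
    sum_interval = n * half_width       # pure interval arithmetic: grows as N
    mean_unc = sigma / math.sqrt(n)     # uncertainty of the mean: shrinks as 1/sqrt(N)
    print(f"N={n:3d}  sum(rss)=+/-{sum_rss:6.2f}  "
          f"sum(interval)=+/-{sum_interval:6.1f}  mean=+/-{mean_unc:5.2f}")
```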

Reply to  Matthew R Marler
September 14, 2019 12:56 am

“take the square root of the sum of variances (assuming independence, which Pat Frank does not do”
He took the square root of the sum of variances, as in Eq 6, or Eq 9 of the SI. That assumes independence, although he doesn’t say so. There is some earlier talk of autocorrelation, but I cannot see that it is used anywhere in the calculations. No values are stated there.

Matthew R Marler
Reply to  Matthew R Marler
September 14, 2019 8:04 am

Nick Stokes: There is some earlier talk of autocorrelation, but I cannot see that it is used anywhere in the calculations.

Although not explicitly displayed, the correlation would be used in computing the covariance of the step(i) and step(i+1) uncertainties that is displayed in equation 4. The exponent 2 on the covariance might be a typo or a convention different from what I have read.

If he had omitted the correlation in this equation, then he would have a serious under-estimate of the final uncertainty. He does not say so (or I missed it or forgot it) but the partial autocorrelation for lags greater than 1 is probably 0 or close to it, so only the lag 1 covariances are needed.
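
For illustration, a sketch of the variance bookkeeping described above; the per-step sigmas and the lag-1 correlation are hypothetical values, not numbers from the paper. Keeping the lag-1 covariance terms enlarges the propagated uncertainty relative to assuming independence.

```python
import math

sigmas = [4.0] * 10   # per-step uncertainties (hypothetical, W/m^2)
r_lag1 = 0.3          # lag-1 correlation between adjacent steps (hypothetical)

var_indep = sum(s ** 2 for s in sigmas)
var_lag1 = var_indep + 2 * sum(r_lag1 * sigmas[i] * sigmas[i + 1]
                               for i in range(len(sigmas) - 1))

print(math.sqrt(var_indep))  # ~12.6: independence assumed
print(math.sqrt(var_lag1))   # ~15.7: larger once lag-1 covariance terms are kept
```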

Reply to  Matthew R Marler
September 15, 2019 10:12 pm

Matthew, I use a multi-model, multi-year average calibration error statistic.

It is static and does not covary. There’s no covariance term to add to the propagation.

John_QPublic
Reply to  Paul Penrose
September 13, 2019 12:51 pm

Paul:

There was a lot of averaging going on. Over 20 years over all grid points, and finally over all the models.

John_QPublic
Reply to  John_QPublic
September 13, 2019 1:46 pm

…Interestingly, we do not retain all of those modes of averaging in our units, to the point I just made to Windchaser about over-emphasizing units. I do not see anything about per grid point, per model, etc. So the fact that it does not have “per year” is not as significant as some people make it out to seem.

John_QPublic
Reply to  Paul Penrose
September 13, 2019 3:33 pm

The final average is per model. I do not see that in the units.

Windchaser
Reply to  John_QPublic
September 13, 2019 7:54 pm

I do not see anything about per grid point, per model, etc. So the fact that it does not have “per year” is not as significant as some people make it out to seem.

When Dr. Frank does his calculation of the forcings, he adds in the uncertainty at each timestep. Since each timestep is a year long (in his calculation), and he re-adds this +/-4W/m2 in at each timestep, he’s adding it in, each year, and treating it as a yearly value.

If you have a time-based equation, and you add something to the iterative function once per year, you’re treating it as a yearly value. A fixed value would get added once at the beginning, a per-year value gets added in each year.

Note that his results come out completely differently if you change the length of the timesteps to, say, months, and still add this uncertainty in at each timestep. (Because then the equation would add in 4W/m2/month, not 4 W/m2/year).

Reply to  Windchaser
September 13, 2019 10:22 pm

It’s a sequential series of calculations, Windchaser, with an uncertainty associated with every single input into every single step of the calculation.

That uncertainty must therefore be propagated through every step of the calculation.

There’s no mystery or ambiguity about it. Doing so is standard uncertainty analysis in every data-oriented experimental science.

angech
Reply to  Windchaser
September 14, 2019 1:37 am

Windchaser
“Note that his results come out completely differently if you change the length of the timesteps to, say, months, and still add this uncertainty in at each timestep. (Because then the equation would add in 4W/m2/month, not 4 W/m2/year).”

Windchaser, this is something that you raise now, that Roy Spencer mentioned in the last article [wrongly, sorry Roy], and that Nick Stokes repeatedly uses.

There are 2 different ways to approach the 4 W/m-2.
Remember that a plain 4 W/m-2 means a rate of energy over time: even though the word time does not appear in the subscript, it is implicit there and explicit in the term watts.
The watt is the unit of power in SI, the International System of Units: 1 watt = 1 joule per second = 1 newton meter per second = 1 kg m2 s-3.

The "watt per square meter" is the SI unit for radiative and other energy fluxes in geophysics. The "solar constant", the solar irradiance at the mean earth-sun distance, is approximately 1370 W m-2. The global and annual average solar irradiance at the top of the earth's atmosphere is one-fourth of the solar constant, or about 343 W m-2.

Nick's first step is to use either of these quite different senses when he is arguing, and then to swap one for the other.

Hence the "solar constant" is repeatedly referred to as a timeless unit, both by him and by ATTP when he supports Nick's argument.
Note that the solar constant does have a time dimension and an area dimension, despite numerous denials by people who should know better. A flux is not just a number without units. Secondly, it is not actually a constant; it varies with the distance from the sun.

So the timestep is 1 second, always.

Pat Frank uses the 4 W/m-2 in his paper in a totally different way. He is discussing the CMIP5 Model Calibration Error in Global Average Annual Total Cloud Fraction (TCF).
The words Global Average Annual mean just that. They are CMIP5's words, not Frank's.
He and they are comparing annual, not monthly, not per-second factors, and giving the calibration error as a percentage of the average annual rate per second.
Very important.

Three things spring to mind.
You say,
"his results come out completely differently if you change the length of the timesteps to, say, months, and still add this uncertainty in at each timestep."
You ignore the fact that when you change the timestep you have to change the amount of energy received in that timestep by the same factor. The total amount of energy in 1 month is 1/12th of that in a year, i.e. 1/3 of a W/m-2/month.

Further, your contention ["and still add this uncertainty in at each timestep"] is wrong, in that you now use a compounding rate for different time lengths and assume that the result should be the same. Roy made the same compounding comment and built-in error.
If you use a month as your basis you have to assess monthly. You have to add each month's energy received together for 12 months, not compound them. The sun does not keep increasing its heat each month. The earth is no hotter after 12 individual months in a row than after the same year. This is one of the basic ways of avoiding error: check that the end point agrees with the assumptions that your calculations give.

Thirdly, Nick Stokes should be called out each and every time he purposely swaps terms. This has been quite difficult for years, as he tends to parse his arguments, deliberately, into one part which is quite true and one that is ever so subtly wrong.
A subtle reminder is his attacking Pat Frank for using root mean squares.
This is a standard way of assessing the size of an error.
Nick and others still have comments up claiming that, since a root mean square is always positive, Frank cannot use the figure as a plus or minus for the standard deviation in uncertainty.
This is despite this being the typical way that uncertainty is quantified: a deviation of plus or minus the root mean square.
Rant finished.

Reply to  John_QPublic
September 13, 2019 10:18 pm

You’ll find my reply here, John.

https://patricktbrown.org/2017/01/25/do-propagation-of-error-calculations-invalidate-climate-model-projections-of-global-warming/#comment-1451

Specifically from the post: Here’s Prof. Lauer as provided: “The RMSE we calculated for the multi-model mean longwave cloud forcing in our 2013 paper is the RMSE of the average *geographical* pattern. This has nothing to do with an error estimate for the global mean value on a particular time scale.”

Prof Lauer is, of course, correct. The crux issue is that he referred the error to a “particular time scale.” The LCF error they calculated is an error estimate for the global mean simulated value on an averaged time-scale. The mean error is a representative average time-scale error, not a particular time-scale error.

It should be clear to all that an average of errors says nothing about the magnitude of any particular error, and that an annual average says nothing about a particular year or a particular time-range.

Following from the step-by-step dimensional analysis above, the Lauer and Hamilton LCF rmse dimension is ±4 Wm⁻² year⁻¹ (grid-point)⁻¹, and is representative of CMIP5 climate models. The geospatial element is included.

John Q Public
Reply to  Pat Frank
September 14, 2019 8:48 am

Thank you, Pat. I did make a point about “per grid point”, etc. Even Lauer did not retain those units, but everyone knows they are there.

michael hart
September 13, 2019 11:43 am

“Importantly, this forced-balancing of the global energy budget is not done at every model time step, or every year, or every 10 years.”

Similar things are done in other fields, such as protein-folding.

The model just can’t do it on its own. So it is helped (forced) to come to what is known to be the “correct” answer. In many instances this may well end up with something approximating the truth but does not advance the science because the model will still fail on problems where the answer has not already been determined by other means.

JohnQPublic
Reply to  michael hart
September 13, 2019 1:15 pm

True. With protein folding [I presume] direct empirical validation is possible. What is powerful about Pat Frank’s work is that in fact he used real, actual, experimental, validated, and accepted data to determine the uncertainty.

Reply to  JohnQPublic
September 13, 2019 5:37 pm

” in fact he used real, actual, experimental, validated, and accepted data to determine the uncertainty”
Really? Where?

The Lauer figure of 4 W/m2 is the difference between GCM results and a measurement. Or more precisely, the correlation of the difference.

Reply to  Nick Stokes
September 13, 2019 10:07 pm

The ±4 W/m^2 is a calibration error statistic, Nick.

Correlations are dimensionless.

It’s very disingenuous of you to transfer a nearby description over to an incorrect object.

You know exactly what you’re doing. You’re obfuscating to confuse, and it’s shameful.

Reply to  Pat Frank
September 14, 2019 1:06 am

The rmse is derived directly from the correlation; your entire basis for the 4 W/m2 is this statement of Lauer:
“For CMIP5, the correlation of the multimodel mean LCF is 0.93 (rmse = 4 W m-2)”
I queried JQP’s assertion
“in fact he used real, actual, experimental, validated, and accepted data to determine the uncertainty”
No answer yet.

Reply to  Pat Frank
September 15, 2019 10:07 pm

Nick, “your entire basis for the 4 W/m2 is this statement of Lauer:
“For CMIP5…

No it’s not. My basis is the description in L&H of how they calculated their error.

Stevek
September 13, 2019 11:51 am

To fix this problem with the models they will have to assume a certain distribution of cloud forcing error. Perhaps saying the error each year is normally distributed and independent of the previous year would work, but that assumption needs to be validated with historical actual cloud data.

Windchaser
Reply to  Stevek
September 13, 2019 1:25 pm

These measurements, of cloud forcing and uncertainty, come from historical cloud data.

Windchaser
September 13, 2019 1:22 pm

You all seem to be missing the point. Dr. Frank has said that he can emulate the output of GCM’s relation to CO2 using a linear equation.

Yes, more specifically, he’s saying that there’s an emulated linear relationship between the forcings and the temperature.

Then he adds an uncertainty for the cloud forcing. Okay, so far, so good. This would (normally) result in an uncertainty for the temperature, right? There’s a direct feed-through: if there’s a linear relationship between forcings and temperature, then there’s a linear relationship between forcing uncertainty and temperature uncertainty.

But Dr. Frank then makes the forcing uncertainty grow over time. And grow. And grow. And grow.

In the real world, is it plausible that the cloud forcing uncertainty grows forever, ad infinitum? Can it really have a value of +/-1000 W/m2, when you look at a long enough time period? This would be enough to boil the oceans or freeze the planet to 0K.

No! It can’t. This idea that cloud forcings could be somewhere between {huge ridiculous numbers} is itself wrong. But that’s what Dr. Frank’s paper says: that the cloud forcing uncertainty grows forever, without limit.

John_QPublic
Reply to  Windchaser
September 13, 2019 3:48 pm

Maybe it could not grow ad infinitum. But certainly it could grow two or three years in a row, or maybe even 10. In that case one would expect some compensatory mechanism in the models to take over and dampen out the signal. Clearly the emulator is not doing that. That does beg the question of whether the full nonlinear-capable model would do that, but I think that is left up to the modelers to answer Pat Frank.

Stevek
Reply to  John_QPublic
September 13, 2019 4:01 pm

In finance, for example, a common model of interest rates employs mean reversion. Interest rates in reality don’t grow without limit, so a mean-reversion term is added to force the rate to return to a mean.
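
A minimal sketch of that contrast, with hypothetical reversion strength and shock size (not parameters of any climate or finance model): a pure random walk’s spread grows without limit, while a process with pull back toward the mean settles to a bounded spread.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 2000, 100
phi, sigma = 0.8, 1.0          # reversion strength and shock size (hypothetical)

walk = np.zeros(n_paths)       # no reversion: a pure random walk
ar1 = np.zeros(n_paths)        # with reversion: AR(1), pulled back toward zero
for _ in range(n_steps):
    shock = rng.normal(0.0, sigma, n_paths)
    walk += shock              # spread grows like sqrt(steps) * sigma
    ar1 = phi * ar1 + shock    # spread settles near sigma / sqrt(1 - phi**2)

print(round(walk.std(), 1))    # ~10.0
print(round(ar1.std(), 1))     # ~1.7
```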

Stevek
Reply to  Windchaser
September 13, 2019 3:56 pm

Looks like Dr. Frank gives an uncertainty at the end of 100 years which is huge. What do the creators of the models say the error distribution is and how do they arrive at it ?

Reply to  Stevek
September 13, 2019 7:10 pm

They do ensemble tests and publish the results.

John Q Public
Reply to  Nick Stokes
September 13, 2019 7:18 pm

The ensemble tests are only tests of precision, and only use the models themselves; there is no external analysis of uncertainty as Pat Frank did.

Reply to  John Q Public
September 14, 2019 1:10 am

People here love parroting precision and accuracy, but to no useful effect. You can’t determine accuracy of predictions of things that haven’t happened. The ensemble tests test exactly what is being talked about here – the propagation of uncertainty. They tell you how much a change in present state affects the future. There is no better test than to see what GCMs actually do to variation, via ensemble.

John Q Public
Reply to  John Q Public
September 14, 2019 9:53 am

You may want to call it parroting, Nick, but the point is that the only error the ensemble test produces is error internal to the models themselves. Pat Frank is comparing the model to a real, actual, measured, verified error statistic. In other words, he is in a sense comparing the model to reality, not just checking the model for internal errors.

slow to follow
Reply to  John Q Public
September 15, 2019 4:18 am

A long while ago I tried to find the formal, proven theoretical framework by which the process, utility and results of these ensemble GCM tests can be evaluated. Does anybody have a current reference? My recollection is that it doesn’t exist, but if it does, it would be good to know.

Reply to  John Q Public
September 15, 2019 9:04 pm

Nick, “The ensemble tests test exactly what is being talked about here – the propagation of uncertainty.”

No they don’t. Ensemble tests are about variability of model runs; how well models agree. That has no connection with propagation of error.

In fact, ensemble tests have no bearing on accuracy at all. Or on reliability.

Reliability is what propagation of error reveals.

Model ensemble tests have nothing to say about whether we should believe climate simulations.

Reply to  Windchaser
September 13, 2019 9:52 pm

The LWCF average annual uncertainty is always ±4W/m^2, Windchaser. It doesn’t grow.

However, as it is a model calibration error statistic representing theory error, the model injects it into every step of a simulation.

That means the uncertainty in projected temperature, not in forcing, grows with the number of simulation steps. All the projection uncertainty bounds I describe are ±T.

Reply to  Pat Frank
September 14, 2019 1:14 am

“The LWCF average annual uncertainty is always ±4W/m^2 “
No, you have been saying emphatically that it is ±4W/m^2/year/model. But I see this dimensioning is quite variable in the SI. Eq 9 gives it as 4 W/m2, and in fact the units get messed up if you try to add a /year (let alone a /unit).

Reply to  Nick Stokes
September 15, 2019 10:04 pm

The SI has no eqn 9, Nick.

The derivation is in eqns. 6-1 through 6-4. Dimensional analysis necessarily produces a global ±4W/m^2/year/model.

I’m impressed with a level of professional acuity, yours, that denies averages their denominator.

Paramenter
September 13, 2019 1:25 pm

Hey Javier,

If we are talking about the models the answer is clearly not. The models are programmed to respond essentially to GHG changes and temporarily to volcanic eruptions, and little else.

Methinks that’s not quite right. Models respond to the several energy fluxes described in such models. But because much greater energy fluxes are in equilibrium, a relatively small CO2 signal may tip the balance and push the trend upwards. That’s my understanding of what Dr Spencer is saying about models and their ‘test runs’: balancing several energy flows. The question now is what happens if the uncertainty associated with some energy fluxes (i.e. cloud forcing) is far larger than the entire signal due to CO2? The signal due to CO2 is simply lost in the noise of uncertainty. Even worse: because this ‘uncertainty calibration error’ propagates, the longer the simulation runs, the wider the uncertainty becomes.

September 13, 2019 1:38 pm

I liked the analogy of the ruler used by one of the commentators.

Let me see whether I can use this analogy correctly here (repeating what has already been said probably):

Say we have a one-meter ruler whose length we are not sure of — the ruler is labeled as a one-meter ruler, but we know that the person who made it was impaired somehow and could have made it 4 cm longer or 4 cm shorter.

We want to use this ruler to measure the sizes of watermelons in a contest to declare someone the winner of the largest-watermelon award. It’s the only ruler available to us, and so we are stuck with it.

Six judges use the ruler to measure a series of melons, and the six judges come up with roughly the same results. We know that rulers, in general, are accurate to within, let’s say, one centimeter. This +/- centimeter is one sort of error, right? The judges, as a group, have great precision, because none of them is ever off by more than one centimeter: they always agree within +/- one centimeter.

But the uncertainty of the ruler’s ACTUAL length is an overriding uncertainty, no matter how many judges come within the +/-centimeter precision of their measures.

The cloud-forcing uncertainty in climate models (i.e., climate-forecasting rulers) is such an uncertainty, yes?, no?, maybe?, sort of?

If not even close, then forgive my confusion and my confusing others — try to forget you ever read this. If correct or close, then yipee, I get it.

Reply to  Robert Kernodle
September 13, 2019 3:11 pm

Yes, but the error with the ruler is a scale error. It doesn’t compound.

Reply to  Nick Stokes
September 13, 2019 6:28 pm

I’d like another assessment of that.

Say we measure a melon, and observe a certain number of gradations on the ruler. Each gradation has an uncertainty about it, because we cannot be confident that a centimeter is a real centimeter.

We don’t know when the gradation is or isn’t a real centimeter. A centimeter on the ruler could be off by some millimeters either way, and, as we continue to measure, not only are we uncertain of exactly how many centimeters we are REALLY measuring, but also we are not certain whether the ruler’s measure represents any one melon’s true length.

The longer the melon, the more gradations of uncertain centimeters accumulate. How is this not compounded with each longer melon?

The measuring instrument itself does not have a knowable value that reflects the real quantity that it measures. With each increasing length, our uncertainty about what this measurement truly represents becomes greater and greater. We have less and less confidence about these measurements over a greater time, even though the output of the measuring instrument falls within certain bounds.

Remember, we don’t know what a gradation on the faulty ruler represents. A centimeter could be 0.6 cm or 1.4 cm or 1 cm, but we cannot know with confidence. And as the distance we measure increases, our uncertainty about what the ruler measures increases. Maybe 2 cm, as measured by the faulty ruler, is any combination of those three possibilities of 0.6, 1.4, or 1 in a sum, that is:

0.6 + 0.6, meaning 2 was really 1.2
0.6 + 1, meaning 2 was really 1.6
1.4 + 1.4, meaning 2 was really 2.8
1 + 1, meaning 2 was actually 2 (but we can’t know)
1 + 1.4, meaning 2 was really 2.4
0.6 + 1.4, meaning 2 was actually 2 (it happens)

Then 3 cm is any combination of those, and so on and so on.

My head is a jumble, but I hope I’m making my point somewhat coherently, which is that there seems to be some sort of propagation here, as we go longer and longer out with an instrument whose scale’s correspondence to reality is not certainly known.
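
A small sketch of the two readings being argued over here, with hypothetical numbers: a single unknown-but-fixed scale error produces an uncertainty that grows in proportion to the measured length, whereas independent per-graduation errors grow only as the square root of the length.

```python
import math

scale_unc = 0.04    # the ruler may read 4% long or short (fixed but unknown)
per_cm_unc = 0.04   # alternative reading: +/-0.04 cm independently per graduation

for length_cm in (10, 50, 100):
    fixed_scale = scale_unc * length_cm              # grows linearly with length
    independent = per_cm_unc * math.sqrt(length_cm)  # grows only as sqrt(length)
    print(f"{length_cm:3d} cm: scale error +/-{fixed_scale:4.2f} cm, "
          f"independent errors +/-{independent:4.2f} cm")
```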

Reply to  Robert Kernodle
September 13, 2019 6:31 pm

The ruler is measuring lengths consistently. It’s just that the units are 0.96m rather than 1m.

Matthew R Marler
Reply to  Nick Stokes
September 13, 2019 9:43 pm

Nick Stokes: The ruler is measuring lengths consistently. It’s just that the units are 0.96m rather than 1m.

You are confounding bias and variance.

Reply to  Nick Stokes
September 14, 2019 2:48 am

There is no variance here. There is just one ruler, incorrectly graduated.

Matthew R Marler
Reply to  Nick Stokes
September 14, 2019 2:48 pm

Nick Stokes: There is just one ruler, incorrectly graduated

You do not know by how much.

Jean Parisot
September 13, 2019 2:40 pm

How much negative forcing from clouds do I have to use to get the models to show cooling?

Kneel
September 13, 2019 6:47 pm

“The similar behavior of the wide variety of different models with differing errors is proof of that. They all respond to increasing greenhouse gases, contrary to the claims of the paper”

I don’t believe that’s true, and here’s why:
the models don’t do a “prediction run” until they have had their free (although constrained) parameters tuned so that they produce a stable state during a control run. This means that they are “tuned” to “cancel out” their errors, leaving only CO2 as a “driver”. Which means that, within the tuning limits, they are all “similar”. It can’t be otherwise – they *must* show a change when a forcing (CO2) changes, and due to tuning and its constraints, this *must* be similar to the other models. There is also further post-hoc selection, where runs that “go off the rails” are discarded – like all “that’s odd” moments, such runs contain valuable information on the operation of the models that is, alas, currently being ignored.

In general, I suspect that the disagreement comes down to the difference between engineering and science – engineers want usefully predictive models with known errors so they can make things work – to them, it doesn’t matter if the model is “realistic” or not, in terms of its workings, only that it makes correct predictions.
Science, on the other hand, wants a model that explains the workings of the system accurately – having “correct” representations of the workings is more important than how accurate the prediction is.
Completely different goals that produce completely different interpretations of what is “useful”.
Both are valuable bits of information, for different reasons.
In short: “In theory [science], theoretically and practically are the same. In practice [engineering], they are different.”

Bill Haag
September 13, 2019 9:40 pm

There seems to be a misunderstanding afoot in the interpretation of the description of uncertainty in iterative climate models. I offer the following examples in the hopes that they clear up some of the mistaken notions apparently driving these erroneous interpretations.

Uncertainty: Describing uncertainty for human understanding is fraught with difficulties, evidence being the lavish casinos that persuade a significant fraction of the population that you can get something from nothing. There are many other examples, some clearer than others, but one successful description of uncertainty is that of the forecast of rain. We know that a 40% chance of rain does not mean it will rain everywhere 40% of the time, nor does it mean that it will rain all of the time in 40% of the places. We do, however, intuitively understand the consequences of comparing such a forecast with a 10% or a 90% chance of rain.

Iterative Models: Let’s assume we have a collection of historical daily high temperature data for a single location, and we wish to develop a model to predict the daily high temperature at that location on some date in the future. One of the simplest, yet effective, models that one can use to predict tomorrow’s high temperature is to use today’s high temperature. This is the simplest of models, but adequate for our discussion of model uncertainty. Note that at no time will we consider instrument issues such as accuracy, precision and resolution. For our purposes, those issues do not confound the discussion below.

We begin by predicting the high temperatures from the historical data from the day before. (The model is, after all, merely a single day offset) We then measure model uncertainty, beginning by calculating each deviation, or residual (observed minus predicted). From these residuals, we can calculate model adequacy statistics, and estimate the average historical uncertainty that exists in this model. Then, we can use that statistic to estimate the uncertainty in a single-day forward prediction.

Now, in order to predict tomorrow’s high temperature, we apply the model to today’s high temperature. From this, we have an “exact” predicted value ( today’s high temperature). However, we know from applying our model to historical data, that, while this prediction is numerically exact, the actual measured high temperature tomorrow will be a value that contains both deterministic and random components of climate. The above calculated model (in)adequacy statistic will be used to create an uncertainty range around this prediction of the future. So we have a range of ignorance around the prediction of tomorrow’s high temperature. At no time is this range an actual statement of the expected temperature. This range is similar to % chance of rain. It is a method to convey how well our model predicts based on historical data.

Now, in order to predict out two days, we use the “predicted” value for tomorrow (which we know is the same numerical value as today, but now containing uncertainty ) and apply our model to the uncertain predicted value for tomorrow. The uncertainty in the input for the second iteration of the model cannot be ‘canceled out’ before the number is used as input to the second application model. We are, therefore, somewhat ignorant of what the actual input temperature will be for the second round. And that second application of the model adds its ignorance factor to the uncertainty of the predicted value for two days out, lessening the utility of the prediction as an estimate of day-after-tomorrow’s high temperature. This repeats so that for predictions for several days out, our model is useless in predicting what the high temperature actually will be.

This goes on for each step, ever increasing the ignorance and lessening the utility of each successive prediction as an estimate of that day’s high temperature, due to the growing uncertainty.

This is an unfortunate consequence of the iterative nature of such models. The uncertainties accumulate. They are not biases, which are signal offsets. We do not know what the random error will be until we collect the actual data for that step, so we are uncertain of the value to use in that step when predicting.
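
A minimal sketch of the bookkeeping described above, on synthetic data (the seasonal cycle and noise level are made up): estimate the one-step residual spread of the persistence model from history, then carry that statistic forward in quadrature at each iteration.

```python
import numpy as np

rng = np.random.default_rng(2)
days = np.arange(3650)
highs = 15 + 10 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 2.5, days.size)

residuals = highs[1:] - highs[:-1]   # observed minus the persistence prediction
sigma_step = residuals.std()         # single-step model (in)adequacy statistic

for k in (1, 2, 7, 30):              # statistic carried forward k iterations in quadrature
    print(f"{k:2d} steps ahead: +/-{sigma_step * np.sqrt(k):.1f} C")
```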

Reply to  Bill Haag
September 13, 2019 10:02 pm

Really a nice explanation, Bill.

It captures the difference between error offset and uncertainty very well.

Also, I like the analogy to the uncertainty in predictions of rain. That’s something everyone can get.

The uncertainties are not errors. They’re expressions of ignorance. Exactly right.

Thanks. 🙂

John Q Public
Reply to  Pat Frank
September 14, 2019 9:57 am

I think we are at the point where climate modelers ought to sit back and allow some statisticians, or at least physical scientists and engineers with a statistical error analysis background, to come in and add to the discussion.

Mortimer Snerd
Reply to  John Q Public
September 14, 2019 11:58 am

John Q,
Agreed. And Haag is a statistician.

Matthew R Marler
September 13, 2019 9:46 pm

Windchaser: Now imagine you added an extra 100 people, basically as a copy of the first 100 people. Person #101 has the same $ as person #1. Person #102 has the same $ as person #2. And so on. A copy of the people and their $.

You are assuming that the individuals are known to have the same, and that the value is known. With propagation of error, neither is known, so the variance of the total increases.

Reply to  angech
September 15, 2019 9:47 pm

Windchaser is wrong.

The propagated uncertainty does not change with time step because the magnitude of the LWCF error varies with the time step.

I pointed this out to Roy, and I’m pointing it out here. It’s an obvious point.

One of the reviewers asked about it. I showed the following calculation, which satisfied the concern.

Let’s first approach the shorter time interval. Suppose a one-day (86,400 seconds) GCM time step is assumed. The ±4 Wm^–2 calibration error is a root-mean-square annual mean of 27 models across 20 years. Call that a rms average of 540 model-years.

The model cloud simulation error across one day will be much smaller than the average simulation error across one year, because the change in both simulated and real global average cloud cover will be small over short times.

We can estimate the average per-day calibration error as (±4 Wm^–2)^2 = [sum over 365 days(e_i )^2], where e_i is the per-day error. Working through this, e_i = ±0.21 Wm^–2. If we put that into the right side of equation 5.2 and set F_0=33.30 Wm^-2, then the one-day per-step uncertainty is ±0.087 C. The total uncertainty after 100 years is sqrt[(0.087)^2*365*100] = ±16.6 C.

Likewise, the estimated 25-year mean model calibration uncertainty is sqrt(16*25) = ±20 Wm^-2. Following from eqn. 5.2, the 25-year per-step uncertainty is ±8.3 C. After 100 years the uncertainty is sqrt[(8.3)^2*4] = ±16.6 C.

These are average uncertainties following from the 540 simulation years and the assumption of a linearly uniform error across the average simulation year. Individual models will vary.

Unfortunately, neither the observational resolution nor the model resolution is able to provide a per-day simulation error. However, the 25-year mean is relatively indicative because the time-frame is only modestly extended beyond the 20-year mean uncertainty calculated in Lauer and Hamilton, 2013.
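
For anyone who wants to check the arithmetic as quoted, a short sketch that only reproduces the stated numbers; the per-step uncertainties of ±0.087 C and ±8.3 C are taken as given from equation 5.2 rather than re-derived.

```python
import math

annual_rmse = 4.0   # +/-4 W/m^2 per year, the figure used in the comment
years = 100

# per-day error such that 365 of them sum in quadrature to the annual value
e_day = annual_rmse / math.sqrt(365)
print(round(e_day, 2))                                  # 0.21 W/m^2

u_day = 0.087        # per-day uncertainty in C, as stated (from eqn. 5.2)
print(round(math.sqrt(u_day ** 2 * 365 * years), 1))    # 16.6 C after 100 years

# 25-year calibration error: 25 annual values summed in quadrature
e_25yr = math.sqrt(annual_rmse ** 2 * 25)
print(round(e_25yr, 1))                                 # 20.0 W/m^2

u_25yr = 8.3         # per-25-year-step uncertainty in C, as stated
print(round(math.sqrt(u_25yr ** 2 * (years // 25)), 1)) # 16.6 C after 100 years
```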

Reply to  angech
September 15, 2019 9:57 pm

You’re exactly right, angech.

I’ll be adding a long comment at the bottom of the post thread about how uncertainty is propagated and the meaning of root-mean-square, all taken from papers about engineering and model errors.

All the relevant WUWT and Roy Spencer blog posts will get their very own copy.

Those papers fully support the way I did the analysis.

kribaez
September 14, 2019 5:09 am

It is obvious that many commenters here think that the “LW cloud forcing” is a forcing. Despite its misleading name, it is not. It forms part of the pre-run net flux balance.

Dr Frank wrote ““LWCF [longwave cloud forcing] calibration error is +/- 144 x larger than the annual average increase in GHG forcing. This fact alone makes any possible global effect of anthropogenic CO2 emissions invisible to present climate models.”

Dr Spencer replied above:- “While I agree with the first sentence, I thoroughly disagree with the second. Together, they represent a non sequitur. ”

Pat Frank implies in the above statement that the LWCF is a forcing. It is not. In his uncertainty estimation, he further assumes that any and all flux errors in LWCF can be translated into an uncertainty in forcing in his emulator. No, it cannot.

Forcings – such as those used in Dr Franks’s emulator – are exogenously imposed changes to the net TOA flux, and can be thought of essentially as deterministic inputs. The cumulative forcing (which is what Dr Frank uses to predict temperature change in his emulator) is unambiguously applied to a system in net flux balance. The LWCF variable is a different animal. It is one of the multiple components in the net flux balance, and it varies in magnitude over time as other state-variables change, in particular as the temperature field changes.

They have the same dimensions, but they are not similar in their effect.

If I change a controlling parameter to introduce a +4 W/m^2 downward change in LWCF at TOA at the start of the 500 year spin-up period in any AOGCM, the effect on subsequent incremental temperature projections is small, bounded and may, indeed, be negligible. If, on the other hand, I introduce an additional 4 W/m^2 to the forcing series at the start of a run, then it will typically add about 3 deg C to the incremental temperature projection over any extended period.
The reason is that, during the spin-up period, the model will be brought into net flux balance. This is not achieved by “tweaking” or “intervention”. It happens because the governing equations of the AOGCM recognise that heating is controlled by net flux imbalance. If there is a positive/negative imbalance in net TOA flux at the aggregate level, then the planet warms/cools until it is brought back into balance by restorative fluxes, most notably Planck.

My hypothetical change of +4 W/m^2 in LWCF at the start of the spin-up period (with no other changes to the system) would cause the absolute temperature to rise by about 3 deg C relative to its previous base. Once forcings are introduced for the run (i.e. after this spin-up period), the projected temperature gain will be expressed relative to this revised base and will be affected only by any change in sensitivity arising. It is important to note that even if such a sensitivity change were visible, Dr Frank has no way to mimic any uncertainty propagation via a changing sensitivity. It would correspond to a change in his fixed gradient relating temperature change to cumulative net flux, but he has no degree of freedom to change this.

None of the above should be interpreted to mean that it is OK to have errors in the internal energy of the system. It is only to emphasise that such errors, and particularly systemic errors, cannot be treated as adjustments or uncertainties in the forcing.

Reply to  kribaez
September 15, 2019 9:24 pm

I make it very clear, kribaez, that LWCF is part of the tropospheric thermal energy flux. Forcing is not my invented term. It’s the term in use.

The ±4 W/m^2 model LWCF calibration error is not a forcing. It’s an uncertainty in the CMIP5 simulation of forcing. It says nothing at all about TOA balance, either in the real climate or in a simulated climate.

You wrote, “… errors, and particularly systemic errors, cannot be treated as adjustments or uncertainties in the forcing.”

Adjustments, no. Uncertainties, yes.

Uncertainty isn’t error, kribaez. Uncertainty is a knowledge deficit. The ±4 W/m^2 LWCF error means one does not know the physically correct intensity of simulated LWCF to better than ±4 W/m^2.

Whatever imbalance is caused by, e.g., a 0.035 W/m^2 annual perturbation from CO2 forcing is lost within that uncertainty.

The model will produce a simulation response to the addition of CO2 forcing, but that simulated response will be physically meaningless.

angech
September 14, 2019 9:44 pm

and Then There’s Physics made an interesting comment on his site.
I append it, and a comment I made:
“The problem is that the intrinsic variability means that the range could represent the range of actually physically plausible pathways, rather than simply representing an uncertainty around the actual pathway. If we have a perfect model, and chaos was not a problem, then maybe we could determine the actual pathway, but we probably can’t. Hence, we shouldn’t necessarily expect the ensemble mean to represent the best estimate of reality.”
Thanks for this refreshing bit of reality.
Also I see.
“James Annan says the following view is implausible, because we can never know the “truth”:

This sums up the Pat Frank controversy.
He is pointing out that the way we do things still has lots of potential errors in it.
This means that there is a small chance that the arguments for AGW may be wrong.
Shooting the messenger is not the right response.
Improving the understanding as Paul suggests is the correct way to go.
People should be thanking him for raising the issue, addressing his concerns on their merits, and then doing the work to address those concerns.

My worry is that, if he is correct, the models have a lot more self-regulation built into them, directed at TOA balance, than they should, which in turn makes them run warm.

Mark
September 15, 2019 6:38 am

This discussion is really appalling and disappointing. It is clear that many if not most commenters as well as Dr. Spencer do not understand the uncertainty term. They don’t understand where it comes from, how it is calculated, or what it means. The uncertainty calculation is a completely separate calculation from the model output calculation. This is basic science. This is freshman engineering (first semester, even). If Pat Frank’s calculated uncertainty is correct, then the models are useless for making predictions. End of story. Full stop. Whenever I have seen climatic temperature predictions I have never seen associated error bars or uncertainties. It is clear that climate scientists and modelers are simply not doing this critical work. They’ve never even heard of it, apparently.

If you suspect your three year old child has a fever would you use a thermometer with a calibrated uncertainty of +/-5 degrees? If the thermometer indicated 98.6 degrees F (sorry, American here), which is normal, but the child is lethargic and feels hot, do you believe this thermometer? Or would you find a thermometer with a calibrated uncertainty of +/- 0.1 degrees F?

My example is simply the output of one single instrument. In a complicated model with many terms, each with its own uncertainty, the collective and combined uncertainty in the final output must be accounted for. You simply can’t produce a clean, simple output of +3 degrees C from a model full of terms with their own unique uncertainties. The final output must also be uncertain, given that the inputs were uncertain. All Dr Frank has done is to calculate a final uncertainty for the models.

Dave Day
Reply to  Mark
September 15, 2019 8:15 am

+1

Especially the “This discussion is really appalling and disappointing. It is clear that many if not most commenters as well as Dr. Spencer do not understand the uncertainty term.”

I apparently had really good teachers in what we called back in the 60s “accelerated” science classes. We were not allowed to submit work without a graphical representation and discussion of error propagation.
No “A”s were given without correct error bars.

You cannot with a straight face call yourself a scientist if you don’t understand and use these constraints on what you can claim to know.

I personally think WUWT has been too complacent about this, and as a community we should insist on discussions of error propagation in all posts involving calculations. Let’s start setting an example in addition to offering constructive criticisms.

I am just a lay person with a good basic set of foundations. For my part, I yesterday ordered several books on the subject from Amazon.

Dave Day

Reply to  Mark
September 15, 2019 9:12 pm

Thank-you, Mark.

You’re right. I just focused on end-state uncertainty produced by a lower limit of error.

Your point about parameter uncertainties also appears in the paper. They’re never, ever propagated through a projection.

Instead, what we get are “perturbed physics” tests, in which parameters are varied across their uncertainties in many model runs. Then, the variability of the runs around their mean is presented as the projection uncertainty.

I’ve had climate modelers call that exercise propagation of error. Really incredible.

I’ll be posting a long comment below, presenting the method of error propagation and the concept of uncertainty from a variety of papers in engineering journals. I suspect you know them all.

You may not know of Vasquez and Whiting, which looks at uncertainty estimation in the context of non-linear models. I found that paper particularly valuable.

Reply to  Pat Frank
September 16, 2019 1:49 pm

Agreed, Dr. Frank. Your pointing this out was very important to me, since I had also noticed the same lack that you, and others, did. I think I’ve even commented to that effect years ago.

September 15, 2019 10:21 pm

For the benefit of all, I’ve put together an extensive post that provides quotes, citations, and URLs for a variety of papers — mostly from engineering journals, but I do encourage everyone to closely examine Vasquez and Whiting — that discuss error analysis, the meaning of uncertainty, uncertainty analysis, and the mathematics of uncertainty propagation.

These papers utterly support the error analysis in “Propagation of Error and the Reliability of Global Air Temperature Projections.”

Summarizing: Uncertainty is a measure of ignorance. It is derived from calibration experiments.

Multiple uncertainties propagate as root sum square. Root-sum-square has positive and negative roots (+/-). Never anything else, unless one wants to consider the uncertainty absolute value.

Uncertainty is an ignorance width. It is not an energy. It does not affect energy balance. It has no influence on TOA energy or any other magnitude in a simulation, or any part of a simulation, period.

Uncertainty does not imply that models should vary from run to run. Nor does it imply inter-model variation. Nor does it necessitate lack of TOA balance in a climate model.

For those who are scientists and who insist that uncertainty is an energy and influences model behavior (none of you will be engineers), or that a (+/-)uncertainty is a constant offset, I wish you a lot of good luck because you’ll not get anywhere.

For the deep-thinking numerical modelers who think rmse = constant offset or is a correlation: you’re wrong.

The literature follows:

Moffat RJ. Contributions to the Theory of Single-Sample Uncertainty Analysis. Journal of Fluids Engineering. 1982;104(2):250-8.

Uncertainty Analysis is the prediction of the uncertainty interval which should be associated with an experimental result, based on observations of the scatter in the raw data used in calculating the result.

Real processes are affected by more variables than the experimenters wish to acknowledge. A general representation is given in equation (1), which shows a result, R, as a function of a long list of real variables. Some of these are under the direct control of the experimenter, some are under indirect control, some are observed but not controlled, and some are not even observed.

R=R(x_1,x_2,x_3,x_4,x_5,x_6, . . . ,x_N)

It should be apparent by now that the uncertainty in a measurement has no single value which is appropriate for all uses. The uncertainty in a measured result can take on many different values, depending on what terms are included. Each different value corresponds to a different replication level, and each would be appropriate for describing the uncertainty associated with some particular measurement sequence.

The Basic Mathematical Forms

The uncertainty estimates, dx_i or dx_i/x_i in this presentation, are based, not upon the present single-sample data set, but upon a previous series of observations (perhaps as many as 30 independent readings) … In a wide-ranging experiment, these uncertainties must be examined over the whole range, to guard against singular behavior at some points.

Absolute Uncertainty

x_i = (x_i)_avg (+/-)dx_i

Relative Uncertainty

x_i = (x_i)_avg (+/-)dx_i/x_i

Uncertainty intervals throughout are calculated as (+/-)sqrt[sum over (error)^2].

The uncertainty analysis allows the researcher to anticipate the scatter in the experiment, at different replication levels, based on present understanding of the system.

The calculated value dR_0 represents the minimum uncertainty in R which could be obtained. If the process were entirely steady, the results of repeated trials would lie within (+/-)dR_0 of their mean …”

Nth Order Uncertainty

The calculated value of dR_N, the Nth order uncertainty, estimates the scatter in R which could be expected with the apparatus at hand if, for each observation, every instrument were exchanged for another unit of the same type. This estimates the effect upon R of the (unknown) calibration of each instrument, in addition to the first-order component. The Nth order calculations allow studies from one experiment to be compared with those from another ostensibly similar one, or with “true” values.

Here, replace “instrument” with ‘climate model.’ The relevance is immediately obvious. An Nth order GCM calibration experiment averages the expected uncertainty from N models and allows comparison of the results of one model run with another, in the sense that the reliability of their predictions can be evaluated against the general dR_N.

Continuing: “The Nth order uncertainty calculation must be used wherever the absolute accuracy of the experiment is to be discussed. First order will suffice to describe scatter on repeated trials, and will help in developing an experiment, but Nth order must be invoked whenever one experiment is to be compared with another, with computation, analysis, or with the “truth.”

Nth order uncertainty:

* Includes instrument calibration uncertainty, as well as unsteadiness and interpolation.
* Useful for reporting results and assessing the significance of differences between results from different experiments and between computation and experiment.

The basic combinatorial equation is the Root-Sum-Square:

dR = sqrt[sum over((dR/dx_i)*dx_i)^2]

https://doi.org/10.1115/1.3241818
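As a concrete illustration of that root-sum-square combination, here is a minimal Python sketch (not from Moffat’s paper; the result function and all numbers are invented) that estimates each sensitivity dR/dx_i by a finite difference and combines the contributions in quadrature:

def propagate_rss(result_fn, x, dx, rel_step=1e-6):
    """Return (R, dR): the result and its root-sum-square uncertainty."""
    R = result_fn(x)
    total = 0.0
    for i, (xi, dxi) in enumerate(zip(x, dx)):
        h = abs(xi) * rel_step or rel_step        # finite-difference step
        x_step = list(x)
        x_step[i] = xi + h
        sensitivity = (result_fn(x_step) - R) / h  # estimate of dR/dx_i
        total += (sensitivity * dxi) ** 2
    return R, total ** 0.5

# Hypothetical example: R = x1*x2/x3, with (+/-) uncertainties dx_i
R, dR = propagate_rss(lambda v: v[0] * v[1] / v[2],
                      x=[10.0, 3.0, 2.0], dx=[0.1, 0.05, 0.02])
print(f"R = {R:.2f} (+/-) {dR:.2f}")

Each squared term is one variable’s contribution to the overall uncertainty, exactly as in the equation quoted above.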

Moffat RJ. Describing the uncertainties in experimental results. Experimental Thermal and Fluid Science. 1988;1(1):3-17.

The error in a measurement is usually defined as the difference between its true value and the measured value. … The term “uncertainty” is used to refer to “a possible value that an error may have.” … The term “uncertainty analysis” refers to the process of estimating how great an effect the uncertainties in the individual measurements have on the calculated result.

THE BASIC MATHEMATICS

This section introduces the root-sum-square (RSS) combination (my bold), the basic form used for combining uncertainty contributions in both single-sample and multiple-sample analyses. In this section, the term dX_i refers to the uncertainty in X_i in a general and nonspecific way: whatever is being dealt with at the moment (for example, fixed errors, random errors, or uncertainties).

Describing One Variable

Consider a variable X_i, which has a known uncertainty dX_i. The form for representing this variable and its uncertainty is

X=X_i(measured) (+/-)dX_i (20:1)

This statement should be interpreted to mean the following:
* The best estimate of X is X_i (measured)
* There is an uncertainty in X_i that may be as large as (+/-)dX_i
* The odds are 20 to 1 against the uncertainty of X_i being larger than (+/-)dX_i.

The value of dX_i represents 2-sigma for a single-sample analysis, where sigma is the standard deviation of the population of possible measurements from which the single sample X_i was taken.

The uncertainty (+/-)dX_i Moffat described exactly represents the (+/-)4 W/m^2 LWCF calibration error statistic derived from the combined individual model errors in the test simulations of 27 CMIP5 climate models.

For multiple-sample experiments, dX_i can have three meanings. It may represent t*S_(N)/sqrt(N) for random error components, where S_(N) is the standard deviation of the set of N observations used to calculate the mean value (X_i)_bar and t is the Student’s t-statistic appropriate for the number of samples N and the confidence level desired. It may represent the bias limit for fixed errors (this interpretation implicitly requires that the bias limit be estimated at 20:1 odds). Finally, dX_i may represent U_95, the overall uncertainty in X_i.

From the “basic mathematics” section above, the overall uncertainty U = root-sum-square = sqrt[sum over((+/-)dX_i)^2] = the root-sum-square of errors (rmse). That is, U = sqrt[sum over((+/-)dX_i)^2] = (+/-)rmse.

The result R of the experiment is assumed to be calculated from a set of measurements using a data interpretation program (by hand or by computer) represented by

R = R(X_1,X_2,X_3,…, X_N)

The objective is to express the uncertainty in the calculated result at the same odds as were used in estimating the uncertainties in the measurements.

The effect of the uncertainty in a single measurement on the calculated result, if only that one measurement were in error, would be

dR_X_i = (dR/dX_i)*dX_i

When several independent variables are used in the function R, the individual terms are combined by a root-sum-square method.

dR = sqrt[sum over((dR/dX_i)*dX_i)^2]

This is the basic equation of uncertainty analysis. Each term represents the contribution made by the uncertainty in one variable, dX_i, to the overall uncertainty in the result, dR.

http://www.sciencedirect.com/science/article/pii/089417778890043X

Vasquez VR, Whiting WB. Accounting for Both Random Errors and Systematic Errors in Uncertainty Propagation Analysis of Computer Models Involving Experimental Measurements with Monte Carlo Methods. Risk Analysis. 2006;25(6):1669-81.

[S]ystematic errors are associated with calibration bias in the methods and equipment used to obtain the properties. Experimentalists have paid significant attention to the effect of random errors on uncertainty propagation in chemical and physical property estimation. However, even though the concept of systematic error is clear, there is a surprising paucity of methodologies to deal with the propagation analysis of systematic errors. The effect of the latter can be more significant than usually expected.

Usually, it is assumed that the scientist has reduced the systematic error to a minimum, but there are always irreducible residual systematic errors. On the other hand, there is a psychological perception that reporting estimates of systematic errors decreases the quality and credibility of the experimental measurements, which explains why bias error estimates are hardly ever found in literature data sources.

Of particular interest are the effects of possible calibration errors in experimental measurements. The results are analyzed through the use of cumulative probability distributions (cdf) for the output variables of the model.

A good general definition of systematic uncertainty is the difference between the observed mean and the true value.

Also, when dealing with systematic errors we found from experimental evidence that in most of the cases it is not practical to define constant bias backgrounds. As noted by Vasquez and Whiting (1998) in the analysis of thermodynamic data, the systematic errors detected are not constant and tend to be a function of the magnitude of the variables measured.

Additionally, random errors can cause other types of bias effects on output variables of computer models. For example, Faber et al. (1995a, 1995b) pointed out that random errors produce skewed distributions of estimated quantities in nonlinear models. Only for linear transformation of the data will the random errors cancel out.

Although the mean of the cdf for the random errors is a good estimate for the unknown true value of the output variable from the probabilistic standpoint, this is not the case for the cdf obtained for the systematic effects, where any value on that distribution can be the unknown true. The knowledge of the cdf width in the case of systematic errors becomes very important for decision making (even more so than for the case of random error effects) because of the difficulty in estimating which is the unknown true output value. (emphasis in original)

It is important to note that when dealing with nonlinear models, equations such as Equation (2) will not estimate appropriately the effect of combined errors because of the nonlinear transformations performed by the model.

Equation (2) is the standard uncertainty propagation sqrt[sum over(±sys error statistic)^2].

In principle, under well-designed experiments, with appropriate measurement techniques, one can expect that the mean reported for a given experimental condition corresponds truly to the physical mean of such condition, but unfortunately this is not the case under the presence of unaccounted systematic errors.

When several sources of systematic errors are identified, beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:

beta ~ sqrt[sum over(theta_S_i)^2], where i defines the sources of bias errors and theta_S is the bias range within the error source i. Similarly, the same approach is used to define a total random error based on individual standard deviation estimates,

e_k = sqrt[sum over(sigma_R_i)^2]

A similar approach for including both random and bias errors in one term is presented by Deitrich (1991), with minor variations, from a conceptual standpoint, from the one presented by ANSI/ASME (1998).

http://dx.doi.org/10.1111/j.1539-6924.2005.00704.x
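For readers who want to see the quoted combination rules in action, here is a tiny sketch (all inputs invented) that pools several systematic bias ranges into beta and several random standard deviations into e_k by root-sum-square, as in the expressions quoted above:

import math

def rss(values):
    # root-sum-square of a list of error components
    return math.sqrt(sum(v * v for v in values))

theta_S = [0.8, 1.2, 0.5]   # hypothetical bias ranges from three error sources
sigma_R = [0.3, 0.4]        # hypothetical random standard deviations

beta = rss(theta_S)         # pooled systematic (bias) term
e_k = rss(sigma_R)          # pooled random term
print(f"beta ~ (+/-){beta:.2f}, e_k ~ (+/-){e_k:.2f}")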

Kline SJ. The Purposes of Uncertainty Analysis. Journal of Fluids Engineering. 1985;107(2):153-60.

The Concept of Uncertainty

Since no measurement is perfectly accurate, means for describing inaccuracies are needed. It is now generally agreed that the appropriate concept for expressing inaccuracies is an “uncertainty” and that the value should be provided by an “uncertainty analysis.”

An uncertainty is not the same as an error. An error in measurement is the difference between the true value and the recorded value; an error is a fixed number and cannot be a statistical variable. An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable.

The term “calibration experiment” is used in this paper to denote an experiment which: (i) calibrates an instrument or a thermophysical property against established standards; (ii) measures the desired output directly as a measurand so that propagation of uncertainty is unnecessary.

The information transmitted from calibration experiments into a complete engineering experiment on engineering systems or a record experiment on engineering research needs to be in a form that can be used in appropriate propagation processes (my bold). … Uncertainty analysis is the sine qua non for record experiments and for systematic reduction of errors in experimental work.

Uncertainty analysis is … an additional powerful cross-check and procedure for ensuring that requisite accuracy is actually obtained with minimum cost and time.

Propagation of Uncertainties Into Results

In calibration experiments, one measures the desired result directly. No problem of propagation of uncertainty then arises; we have the desired results in hand once we complete measurements. In nearly all other experiments, it is necessary to compute the uncertainty in the results from the estimates of uncertainty in the measurands. This computation process is called “propagation of uncertainty.”

Let R be a result computed from n measurands x_1, …, x_n, and W denotes an uncertainty with the subscript indicating the variable. Then, in dimensional form, we obtain: W_R = sqrt[sum over (error_i)^2].

https://doi.org/10.1115/1.3242449

Henrion M, Fischhoff B. Assessing uncertainty in physical constants. American Journal of Physics. 1986;54(9):791-8.

“Error” is the actual difference between a measurement and the value of the quantity it is intended to measure, and is generally unknown at the time of measurement. “Uncertainty” is a scientist’s assessment of the probable magnitude of that error.

https://aapt.scitation.org/doi/abs/10.1119/1.14447

Chris Thompson
September 16, 2019 7:15 am

Bill Haag’s example is very clever, and rings true.

However, let’s think about the same model a little differently.

Let’s say our dataset of thousands of days shows the hottest ever day was 34 degrees C and the lowest 5 degrees C. The mean is 20 degrees C, with a standard deviation of +/- 6 degrees C.

Let’s say today is 20 degrees C. Tomorrow is likely closer to 20 than 34. The standard deviation suggests that, roughly two times out of three, tomorrow’s temperature will fall between 14 and 26 degrees.

But is this the correct statistic to predict tomorrow’s temperature, given today’s?

Actually, that statistic is a little different. A better statistic would be the uncertainty of the change in temperature from one day to the next.

So let’s say we go back to the dataset and find that 19 out of 20 days are likely to be within +/- 5 degrees C of the day before.

Is this a more helpful statistic? When today’s temperature is in the middle of the range, +/- 5 degrees C sounds fair and reasonable. But what if today’s temperature was 33 degrees C, does +/- 5 degrees C still sound fair and reasonable – given that it’s never exceeded 34 degrees C, in the entire dataset?

It’s clear that the true uncertainty distribution for what happens after very hot or very cold days is skewed: the next day is more likely to move back toward the average than to move further away from it.

To properly calculate the uncertainty bounds for a question like this, one has to get the uncertainty bounds for each starting point, and then we find that, like beams of a torch, they all point ahead, but towards the middle. And overall, it is never possible for the uncertainty to exceed that of the entire dataset, no matter what the previous day’s temperature. The uncertainty in prediction does not get bigger if we look 4 days or 40 or 400 days ahead. The outer bounds of the true uncertainty range are restricted by the range of temperatures that are actually possible. Those bounds change far more slowly than an extrapolation of the single day-to-day uncertainty would predict.

To flesh this out, let’s try compounding uncertainties in the light of this dataset. Let’s say that we know our uncertainty is +/- 5 degrees, on average, starting at 20 degrees C. Is the uncertainty range for a prediction 2 days ahead +/- 10 degrees? If we went out 10 days, does the uncertainty grow to +/- 50 degrees Centigrade? Plainly not.

We can’t just keep adding uncertainties like that, because should a day actually get 5 degrees hotter, two days in a row, it gets close to the record maximum for our location, and it has great difficulty getting a lot hotter than that, whereas it is very much more likely to get cooler.

Statistically, random unconstrained uncertainties compound, as Dr. Frank has pointed out, with the square root of the sample count. After four days, our uncertainty would double to 10 degrees C, and after 16 days, double again to 20 degrees C. After 64 days, the extrapolated uncertainty range becomes an impossible +/- 40 degrees C.
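A quick arithmetic check of those figures, assuming N identical, independent (+/-)5 C per-step uncertainties combined in quadrature (so the total grows as 5*sqrt(N)):

import math

per_step = 5.0  # C, the assumed day-to-day uncertainty
for n_days in (1, 4, 16, 64):
    print(f"after {n_days:2d} days: (+/-){per_step * math.sqrt(n_days):.0f} C")
# prints (+/-)5, (+/-)10, (+/-)20, (+/-)40 C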

Since such a range is truly impossible, there must be something wrong with our uncertainty calculation… and there is.

The mistake was to extrapolate a daily uncertainty, ad infinitum, by the square-root-of-N method. Exactly the method used by Dr. Frank. It is wrong in this setting and it was wrong in his.

The simple fact is that the uncertainty range for any given day cannot exceed +/- 6 degrees C, the standard deviation of the dataset, no matter how far out we push our projection. It wouldn’t matter much what the temperature of the start day was; future uncertainty doesn’t get any greater than that.

An analysis of this kind shows us that measures of uncertainty cannot be compounded infinitely – at least, in systems of limited absolute uncertainty.

Dr. Frank’s paper is based entirely on projecting yearly uncertainties out into the future. Unfortunately he is misusing the statistics he professes to understand so well.

That the various models have not spread and distributed themselves more widely, despite having been run for some time, indicates that the uncertainties predicted are excessively wide. Of course, in time, we will find the answer; Dr. Frank and I would agree that the test of an uncertainty estimate is resolved by how well the dataset ends up matching the uncertainty bounds. If the models stay well inside those boundaries and do not come close to their borders, then we will know that those uncertainty bounds were incorrect – and vice versa.

I am a practising researcher and I do understand the underlying statistical methods.

Bill Haag
Reply to  Chris Thompson
September 16, 2019 8:14 am

Chris,

Thank you for the kind words in the first sentence.

However, you are not “thinking about the same model a little differently”; you are changing the model. So everything after is not relevant to my points. Perhaps to other points, but not to my example of the projection of uncertainty, which was my point.

Once again, the model was to use the prior day’s high temperature to predict each day’s high temperature. The total range of the data over however many days of data you have is irrelevant for this model. From the historical data, a set of residuals is calculated for each observed-minus-predicted pair. These residuals are the ‘error’ in each historical prediction. The residuals are then used to calculate a historical model-goodness statistic (unspecified here to avoid other disagreements posted on the specifics of such calculations).

This model is then used going forward. See the earlier post for details, but it is the uncertainty not the error that is propagated. The model estimate for the second day out from today is forced to use the uncertain estimated value from the model of the first day out, while contributing its own uncertainty to its prediction. And so it goes forward.

Bill
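The forward recursion Bill describes can be sketched in a few lines. This is only an illustration of the bookkeeping: the (+/-)2.5 C per-step figure stands in for a statistic computed from imagined historical residuals, and independent contributions are combined in quadrature:

import math

per_step = 2.5   # C, hypothetical uncertainty from historical residuals
u = 0.0          # today's value is known, so its uncertainty is zero
for day in range(1, 8):
    # each day's prediction inherits yesterday's uncertainty and adds its own
    u = math.sqrt(u ** 2 + per_step ** 2)
    print(f"day +{day}: prediction uncertainty (+/-){u:.2f} C")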

Bill Haag
Reply to  Bill Haag
September 16, 2019 8:24 am

Chris,

You also are confusing uncertainty with error. The uncertainty is a quantity that describes the ignorance of a predicted value. Like the 40% chance of rain, it is not a description of physical reality, or physical future. It doesn’t rain 40% of the time everywhere, nor does it rain all the time in 40% of the places. But the uncertainty of rainfall is communicated without our believing that one of the two physical realities is being predicted.

Bill

1sky1
Reply to  Bill Haag
September 16, 2019 4:18 pm

[I]t is the uncertainty not the error that is propagated. The model estimate for the second day out from today is forced to use the uncertain estimated value from the model of the first day out, while contributing its own uncertainty to its prediction. And so it goes forward.

While such iterative propagation of purely PROBABILISTIC uncertainty may be applicable to some problems and their models, the signal analysis conception of “prediction” necessarily entails unequivocal specification of a particular value of the variable of interest. That is the nature of predictions provided by Wiener or Kalman filters that are used in sophisticated modeling of geophysical time-series. In that context, model error is far more telling than some a priori academic specification of “uncertainty.”

GCMs, on the other hand, are a mish-mash of attempts at DETERMINISTIC physical modeling, with diverse simplifications and/or parametrizations substituting for genuine physics in treating key factors, such as cloud effects. What has not been adequately established here is how the different GCMs actually treat the posited ±4 W/m^2 uncertainty in the “cloud forcing.” Is that “forcing” simply held constant, as is relative humidity in most GCMs, or is it computed recursively at each time-step – and if so, how? Until this basic question is resolved, the applicability of the simple propagation model given here to the actual problem at hand will remain itself uncertain.

Reply to  1sky1
September 16, 2019 5:07 pm

“simply held constant, as is relative humidity in most GCMs”
Relative humidity is not held constant in any GCMs. They couldn’t do it even if they wanted to. Water is conserved.

1sky1
Reply to  1sky1
September 17, 2019 3:11 pm

I had in mind that RH remains fixed in GCMs on a global scale. As Isaac Held points out (https://www.gfdl.noaa.gov/blog_held/31-relative-humidity-in-gcms/):

In the first (albeit rather idealized) GCM simulation of the response of climate to an increase in CO2…[Manabe and Wetherald] found, in 1975, that water vapor did increase throughout the model troposphere at roughly the rate needed to maintain fixed RH.

You’re correct, however, that RH simulations vary spatio-temporally. In reference to his model animations, Held states:

These animations can be useful in discussions with those unfamiliar with GCMs, who might mistakenly think that RH is fixed by fiat in these models. The result that RH distributions remain more or less unchanged in warmer climates is an emergent property of these models.

But no one has answered my essential question of how the “cloud forcing” at issue here is handled by various models. I suspect that it’s treated differently by different models, with highly non-uniform consequences upon model uncertainty.

Mark
Reply to  Chris Thompson
September 16, 2019 2:13 pm

Wrong. The model outputs have nothing to do with the uncertainty Pat Frank has calculated. Uncertainty has a very specific meaning here. It is well-defined and it is basic science. Whether it is a climate model, some other kind of model, or a calculated temperature for a turbine engine, if the inputs have uncertainties then the output must also. And the inputs always have uncertainties. And therefore the output must also have an uncertainty, yet climate modelers neither calculate it nor state it.

The calculation of the uncertainty for the final output has nothing to do with the output of the model itself. The uncertainty calculation is a separate calculation. In my simple example above, a calibrated thermometer, you can take the temperature of 50, 100, 1000, or 10,000 people and the uncertainty will always be +/- 5 degrees for that thermometer. Period. The uncertainty of this thermometer does not improve the more you use it. You could then analyze this set of 10,000 temperature data points and calculate the standard deviation of this set if you wanted to. But that would be irrelevant to the uncertainty of the thermometer which would remain +/- 5 degrees. The only way you can reduce this device-specific uncertainty is to get a new thermometer with a tighter uncertainty. But it will have an uncertainty, too.
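A small simulation makes the same point. The true temperature, the hidden calibration offset, and the noise level below are all invented:

import random

random.seed(1)
true_temp = 98.6                     # the quantity we are trying to measure
offset = random.uniform(-5.0, 5.0)   # unknown fixed calibration error, within (+/-)5
noise_sd = 0.2                       # random reading-to-reading scatter

for n in (50, 1000, 10000):
    readings = [true_temp + offset + random.gauss(0.0, noise_sd) for _ in range(n)]
    mean = sum(readings) / n
    print(f"n = {n:6d}: mean reading = {mean:.2f} (true value {true_temp})")
# The mean converges to true_temp + offset, not to true_temp: averaging
# removes the random scatter but leaves the (+/-)5 degree calibration
# uncertainty untouched.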

If this thermometer is used in an experiment, along with a scale and a ruler, each device has an associated uncertainty. If we get readings from our thermometer, scale, and ruler, and these readings are used in a calculation, the calculation’s final answer must also have an uncertainty because the inputs were uncertain. This final uncertainty is derived from the individual uncertainties of the elements of the end calculation and not the end calculation itself. It is a separate analysis and it only considers the individual uncertainty of the various elements that went into the calculation. It is agnostic to the output of our calculation. It is separate and distinct from it.

If I model the internal temperature of a new clean-sheet turbine engine design and at one point it reaches 700 degrees F that’s useful information, if the model is correct. But what if the uncertainty of our calculated (modeled) temperature is +/- 3000 degrees F? 700 degrees F +/- 3000 degrees F could easily put us in a place where things start to melt. This is not a useful model.

And as Pat Frank has demonstrated, neither are these climate models, based solely on the output uncertainty. And while there may be other reasons to reject this crop of climate models, as has Roy Spencer, there is really no reason to go further than the uncertainty analysis that Frank has performed. These models are like a hurricane cone spanning 180 degrees. That only tells us it’s going somewhere, which we already knew.

Reply to  Chris Thompson
September 16, 2019 10:14 pm

Your discussion is wrong Chris Thompson, because you’re assigning physical meaning to an uncertainty.

Your mistake becomes very clear when you write that, “could the uncertainties add up to +/- 50 degrees Centigrade? Plainly not. We can’t just keep adding uncertainties like that, because should a day actually get hot two days in a row, it has great difficulty getting a lot hotter, and becomes more likely to get cooler.”

That (+/-)50 C says nothing about what the temperature could actually be. It’s an estimate of what you actually know about the temperature 10 days hence. Namely, nothing.

That’s all it means. It’s a statement of your ignorance, not of temperature likelihood.

You’re a practicing researcher, Chris Thompson, but you do not understand the meaning of the statistics you use.

September 16, 2019 10:11 pm

This illustration might clarify the meaning of (+/-)4 W/m^2 of uncertainty in annual average LWCF.

The question to be addressed is: what accuracy is necessary in simulated cloud fraction to resolve the annual impact of CO2 forcing?

We know from Lauer and Hamilton that the average CMIP5 (+/-)12.1% annual cloud fraction (CF) error produces an annual average (+/-)4 W/m^2 error in long wave cloud forcing (LWCF).

We also know that the annual average increase in CO2 forcing is about 0.035 W/m^2.

Assuming a linear relationship between cloud fraction error and LWCF error, the (+/-)12.1% CF error is proportionately responsible for (+/-)4 W/m^2 annual average LWCF error.

Then one can estimate the level of resolution necessary to reveal the annual average cloud fraction response to CO2 forcing as (0.035 W/m^2/(+/-)4 W/m^2)*(+/-)12.1% cloud fraction = 0.11% change in cloud fraction.

This indicates that a climate model needs to be able to accurately simulate a 0.11% feedback response in cloud fraction to resolve the annual impact of CO2 emissions on the climate.

That is, the cloud feedback to a 0.035 W/m^2 annual CO2 forcing needs to be known, and able to be simulated, to a resolution of 0.11% in CF in order to know how clouds respond to annual CO2 forcing.

Alternatively, we know the total tropospheric cloud feedback effect is about -25 W/m^2. This is the cumulative influence of 67% global cloud fraction.

The annual tropospheric CO2 forcing is, again, about 0.035 W/m^2. The CF equivalent that produces this feedback energy flux is again linearly estimated as (0.035 W/m^2/25 W/m^2)*67% = 0.094%.

Assuming the linear relations are reasonable, both methods indicate that the model resolution needed to accurately simulate the annual cloud feedback response of the climate, to an annual 0.035 W/m^2 of CO2 forcing, is about 0.1% CF.
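Both estimates reduce to one line of arithmetic each; here they are reproduced with the inputs stated above:

annual_co2_forcing = 0.035   # W/m^2

# Method 1: scale the (+/-)12.1% CF error by CO2 forcing / LWCF error
print(f"{annual_co2_forcing / 4.0 * 12.1:.2f}% cloud fraction")    # ~0.11%

# Method 2: scale the 67% cloud fraction by CO2 forcing / cloud feedback
print(f"{annual_co2_forcing / 25.0 * 67.0:.3f}% cloud fraction")   # ~0.094%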

To achieve that level of resolution, the model must accurately simulate cloud type, cloud distribution and cloud height, as well as precipitation and tropical thunderstorms.

This analysis illustrates the meaning of the (+/-)4 W/m^2 LWCF error. That error indicates the overall level of ignorance concerning cloud response and feedback.

The CF ignorance is such that tropospheric thermal energy flux is never known to better than (+/-)4 W/m^2. This is true whether forcing from CO2 emissions is present or not.

GCMs cannot simulate cloud response to 0.1% accuracy. It is not possible to simulate how clouds will respond to CO2 forcing.

It is therefore not possible to simulate the effect of CO2 emissions, if any, on air temperature.

As the model steps through the projection, our knowledge of the consequent global CF steadily diminishes because a GCM cannot simulate the global cloud response to CO2 forcing, and thus cloud feedback, at all for any step.

It is true in every step of a simulation. And it means that projection uncertainty compounds because every erroneous intermediate climate state is subjected to further simulation error.

This is why the uncertainty in projected air temperature increases so dramatically. The model is step-by-step walking away from initial value knowledge further and further into ignorance.

On an annual average basis, the uncertainty in CF feedback is (+/-)144 times larger than the perturbation to be resolved.

The CF response is so poorly known, that even the first simulation step enters terra incognita.

chris.thompson
September 18, 2019 6:25 am

Pat Frank, you say, “That (+/-)50 C says nothing about what the temperature could actually be. It’s an estimate of what you actually know about the temperature 10 days hence. Namely, nothing.”

All uncertainty estimates, especially unknown future uncertainty estimates, are based on certain assumptions. You are estimating the uncertainty of models of the earth’s future temperature. If your estimate of their uncertainty 5 years from now is so much wider than any actual possible range into which the earth itself could fall, then your method of estimating that uncertainty is incorrect. If the models were themselves to predict changes outside of that range, they would plainly be incorrect also, and for the same reason.

You said yourself that uncertainty is resolved as time passes. The test is whether or not actual events fit well in the middle of the uncertainty range, or deviate widely from it. A good, accurate uncertainty range is one in which, if the experiment is repeated multiple times, the outcome falls within the range of uncertainty 19 times out of 20. If repeated measurements either fall widely outside the predicted uncertainty range, or, the opposite, if they fall in a tight narrow band nowhere near the predicted uncertainty range, then that uncertainty range was wrong.

Your future uncertainty estimates were about the uncertainty of the temperature of the earth as predicted by the models. You argue that there were errors in the models such that the future uncertainty range would widen every year, growing with the square root of the number of years, without limit. However the reality is that this is not possible. You take this to indicate that the models are wrong. An equally valid interpretation is that the way you calculated the predicted future uncertainty range is wrong. The latter view is mine and that of many scientists around you.

The uncertainty range in the ‘let’s predict the temperature of some day in the future based on today’ example reaches a maximum uncertainty value, since there is a time beyond which uncertainty does not, and cannot, get any greater, no matter how many days we predict forward. I would be absolutely correct in saying the 95% confidence limit of any prediction going forward even 100 days sits within the standard deviation of all temperatures ever recorded, or close thereto. Whether I predict 20 or 50 or 100 or even 500 days ahead, the actual uncertainty of the prediction cannot get any bigger than that possible range. This is a simple example of bounded uncertainty.

I’m sure there are unbounded examples, where uncertainty limits can keep growing with the square root of time, without limit, forever. But this is a good example of one where the future uncertainty range is limited by some external bound and cannot increase infinitely over time. There are physical constraints that mean that it cannot.

I do agree with you that the models you criticise become meaningless once their future uncertainty range, projected forward, reaches the maximum possible increase in temperature of the earth. Predicting absolutely impossible future events indicates that the uncertainty range is inappropriate. However, if the models stay, as time passes, well within your theoretical predicted uncertainty range for the models, then your uncertainty range estimated for them was false.

You seem to want to have your cake and eat it. You suggest that the models are wrong *in predicting the earth’s temperature* because they could be out by an extraordinarily wide uncertainty range in only a few years, but at the same time, you say *but my uncertainty range has nothing to do with actual possible predictions of the earth’s temperature*. You can’t have it both ways. It either is an uncertainty about the earth’s future temperature, or it is not. If it is, then it is a constrained value; if it is not about the earth’s temperature, then it is irrelevant.

September 18, 2019 5:25 pm

Chris, you wrote, “If your estimate of their uncertainty 5 years from now is so much wider than any actual possible range into which the earth itself could fall, then your method of estimating that uncertainty is incorrect.”

You’re confusing uncertainty with physical measurement, Chris. Uncertainty concerns the reliability of the prediction. It has nothing to do with the actual behavior of the physical system.

Uncertainty in the prediction tells us the reliability of the prediction. It is a measure of how reliable are the prediction methods one has in hand, in this case, climate models. Whatever the physical system does later has no bearing on the validity of the uncertainty estimate.

If the physical system produces values that are very far away from the uncertainty bound, then this means the errors and uncertainties we used to estimate the final uncertainty were not adequate to the job.

Uncertainty bounds are always an interval around the predicted value.

If the physical system produces values that are very far away from the uncertainty bound, this means that the predicted value itself is also very far away from the final state of the physical system.

This would mean the physical model is also very poor because it predicts results that are far away from reality.

It does not mean that the method used to make the uncertainty calculation (root-sum-square) was wrong.

This basic mistake informs your entire analysis. Wrong premise, wrong conclusions.

Chris Thompson
September 19, 2019 8:53 am

Let’s say I have a prediction method for future temperature that I think has narrow uncertainty bounds. And a whole bunch of other people have their own methods, and we all end up with similarly narrow uncertainty bounds. All are well within the ability of the earth to change by the amount predicted; let’s say they are all relatively modest in the extent of their change.

Someone else does some maths and suggest that these models actually have very wide uncertainty bounds. Far wider, in fact, than the earth actually can change in temperature.

After 15 years, it turns out that when we get the actual temperatures of the earth, and compare them to the predictions, the majority – let’s say 49 out of 50 – fit within the narrow uncertainty bounds originally estimated by the people who made the models. None of them come anywhere close to the much wider uncertainty bounds predicted by others.

Which of those uncertainty predictions just got proved to be a more accurate estimate of the true uncertainty of the models?

While you say, “uncertainty has nothing to do with the actual behaviour of the physical system”, that’s not correct. The uncertainty of a prediction has everything to do with the actual behaviour of the real physical system, because uncertainty is an estimate of the possible range of the differences between the behaviour of the model and the actual behaviour of the real physical system – the range of likely future errors between the model and the real system. Uncertainty about how a real system may perform in the future cannot be imagined to be separate from the realities of the physical system with which the prediction is ultimately to be compared.

If a proponent of a model were to suggest that its uncertainty bounds lie well outside the acknowledged realm of possible future values of the physical system, that model would not get much traction, unless it was intended to indicate that the acknowledged realm was incorrect. A simple way to attack the utility of a predictive model is to suggest it has wider uncertainty limits than are physically possible. And that’s your premise.

In my example of predicting temperature going forward, if someone said that their estimate of the uncertainty of my future estimation method was +/- 200 degrees centigrade by 100 days ahead, when the greatest temperature range in the last 1000 days was only +/- 20 degrees centigrade, they would need extraordinary evidence that something truly amazing was about to happen, because otherwise the odds are that their uncertainty estimate is much too wide.

Even if my method of future estimation was to randomly take any number anywhere between the lowest and highest temperatures ever recorded, and say, “that’s what it will be in 100 days”, the precision of that estimate would not be much worse than it would be at 10 days. In fact, after a short time, the likely precision of the estimate of any particular temperature would depend only on the temperature value predicted, not the time ahead, since the ranges would not be symmetrically distributed for values further away from the mean, and each temperature would have a reduced probability of occurrence the further it was away from the mean. It becomes apparent that the true uncertainty of the method depends only, after a time, on the value selected as the predicted future value, not the duration over which the prediction is made in advance. By your logic, the uncertainty always grows and grows, infinitely; however, in a model of this kind, it actually does not.

Having found a good example of a simple model in which your approach can be shown to fail, you need to admit that there are some systems in which your approach can only lead to incorrect conclusions.

The error you make is to apply a compounding-uncertainties method: square-root-of-N compounding may well be appropriate for some unbounded systems, but it is simply the wrong method to use for calculating the uncertainty of the future conditions of bounded systems.

The simplest example is some uncertainty measurement of the location of a gas molecule over time. The range of uncertainty about its future position, compared to its present position, grows without limit over time, roughly in proportion to the square root of elapsed time. But if we put that molecule in a box that it cannot escape from, suddenly our prediction model needs to change. That is the difference between predicting uncertainty in a bounded vs an unbounded system. Since the earth’s temperature is bounded, uncertainty predictions like yours, that sit way outside those bounds, are meaningless.
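A toy simulation (not a climate model; the step and box sizes are arbitrary) of that gas-molecule picture: the spread of an unconstrained random walk keeps growing roughly as the square root of time, while the same walk confined to a box levels off:

import random, statistics

random.seed(0)
N_WALKS, N_STEPS, STEP, BOX = 2000, 400, 1.0, 10.0

free = [0.0] * N_WALKS
boxed = [0.0] * N_WALKS
for t in range(1, N_STEPS + 1):
    for i in range(N_WALKS):
        dx = random.choice((-STEP, STEP))
        free[i] += dx
        boxed[i] = min(BOX, max(-BOX, boxed[i] + dx))   # clamp to the box walls
    if t in (25, 100, 400):
        print(f"t = {t:3d}: free spread {statistics.pstdev(free):5.1f}, "
              f"boxed spread {statistics.pstdev(boxed):4.1f}")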

Reply to  Chris Thompson
September 20, 2019 7:32 pm

Chris Thompson, your supposed model of making predictions and then seeing how they turn out is analogous to someone going into the future to see how the cards appear in a gambling casino, and then going back in time and saying that the game of chance has no uncertainty.

Your model is ludicrous.

You wrote, “… because uncertainty is an estimate of the possible range of the differences between the behaviour of the model and the actual behaviour of the real physical system.”

No, it is not. Uncertainty is the estimate of the predictive reliability of the model. It has nothing whatever to do with the final magnitude of error.

You continue to make the same mistake. Uncertainty is not error.

You wrote, “A simple way to attack the utility of a predictive model is to suggest it has wider uncertainty limits than are physically possible. And that’s your premise.”

Uncertainty bounds wider than physical possibility means that the prediction has no physical meaning. It means the model is predictively useless. That is not my premise. That is the analytical meaning of uncertainty.

I made a post here of material from published literature about the meaning of uncertainty. It is mostly from engineering journals, with links to the literature. Try looking at that, Chris. You’ll discover that your conception is entirely wrong.

Here is a small quote from S. J. Kline (1985) The Purposes of Uncertainty Analysis. Journal of Fluids Engineering 107(2), 153-160.

From the paper: “An uncertainty is not the same as an error. An error in measurement is the difference between the true value and the recorded value; an error is a fixed number and cannot be a statistical variable. An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable. (my bold)”

That quote alone refutes your entire position. I’m not going to dispute with you after this, Chris. There is no point in continuing to discuss with someone who insists on a wrong argument.

You wrote, “… but it is simply the wrong method to use for calculating the uncertainty of the future conditions of bounded systems.”

Once again you are confusing uncertainty with error. This is the core problem. You continue to argue from the wrong position. Uncertainty grows without bound. When uncertainty bounds of a prediction are beyond the range of a physically bounded system, it means that the prediction has no physical meaning.

You need to study more, Chris. Try applying yourself to the papers abstracted in the post I linked. You’ll become a better scientist (as I did).

Alasdair Fairbairn
September 20, 2019 10:33 am

To me, the basic error in these models stems from either ignoring mechanisms in the science other than radiation that influence the climate equilibrium budget, or assuming that there are none.

It is interesting to note that the TOA energy equilibrium is deemed (or forced) to be fixed in these models and that all of them “respond to increasing greenhouse gases”, with the result that they behave similarly with respect to climate sensitivity and ocean heat uptake.

Analysis of the thermodynamic behaviour of water, particularly at the evaporative phase change, shows that there is a strong mechanism here which results in the transport of large energies (some 694 watt-hours per kilogram) up through the atmosphere due to the inherent buoyancy** of the vapor, unaffected by GHGs such as CO2, for dissipation on the way up and to space. This provides a strong influence on the TOA energy balance.
Further, this process takes place at constant temperature and thus has a zero-value sensitivity coefficient (S) in the Planck equation dF = S*dT, which, if ignored in the calculation of the global sensitivity, would lead to an overestimate of its value; hence the models running HOT.
The thermodynamics also affects the calculation of ocean heat uptake, as is evident in the fact that we all sweat to keep cool.
IMO this simple omission, albeit very complex in cloud structures, is a root cause of much of the problem.

** Note: I find that in all the literature on the subject, both sceptic and otherwise, references to convection rarely, if ever, include this buoyancy factor, which is an entirely different mechanism and does not depend on a temperature differential.
A vital distinction when considering clouds.
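For reference, the 694 watt-hours-per-kilogram figure quoted above is consistent with a latent heat of vaporization of roughly 2.5 MJ/kg (its value near 0 C; it is a few percent lower at warm surface temperatures):

latent_heat = 2.50e6          # J/kg, assumed latent heat of vaporization
print(f"{latent_heat / 3600.0:.0f} Wh/kg")   # ~694 Wh/kg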