Guest post by Pat Frank
Readers of Watts Up With That will know from Mark I that for six years I have been trying to publish a manuscript with the post title. Well, it has passed peer review and is now published at Frontiers in Earth Science: Atmospheric Science. The paper demonstrates that climate models have no predictive value.
Before going further, my deep thanks to Anthony Watts for giving a voice to independent thought. So many have sought to suppress it (freedom denialists?). His gift to us (and to America) is beyond calculation. And to Charles the moderator, my eternal gratitude for making it happen.
Onward: the paper is open access. It can be found here, where it can be downloaded; the Supporting Information (SI) is here (7.4 MB pdf).
I would like to publicly honor my manuscript editor Dr. Jing-Jia Luo, who displayed the courage of a scientist; a level of professional integrity found lacking among so many during my 6-year journey.
Dr. Luo chose four reviewers, three of whom were apparently not conflicted by investment in the AGW status-quo. They produced critically constructive reviews that helped improve the manuscript. To these reviewers I am very grateful. They provided the dispassionate professionalism and integrity that had been in very rare evidence within my prior submissions.
So, all honor to the editors and reviewers of Frontiers in Earth Science. They rose above the partisan and hewed the principled standards of science when so many did not, and do not.
A digression into the state of practice: Anyone wishing a deep dive can download the entire corpus of reviews and responses for all 13 prior submissions, here (60 MB zip file, Webroot scanned virus-free). Choose “free download” to avoid advertising blandishment.
Climate modelers produced about 25 of the prior 30 reviews. You’ll find repeated editorial rejections of the manuscript on the grounds of objectively incompetent negative reviews. I have written about that extraordinary reality at WUWT here and here. In 30 years of publishing in Chemistry, I never once experienced such a travesty of process. For example, this paper overturned a prediction from Molecular Dynamics and so had a very negative review, but the editor published anyway after our response.
In my prior experience, climate modelers:
· did not know to distinguish between accuracy and precision.
· did not understand that, for example, a ±15 C temperature uncertainty is not a physical temperature.
· did not realize that deriving a ±15 C uncertainty to condition a projected temperature does *not* mean the model itself is oscillating rapidly between icehouse and greenhouse climate predictions (an actual reviewer objection).
· confronted standard error propagation as a foreign concept.
· did not understand the significance or impact of a calibration experiment.
· did not understand the concept of instrumental or model resolution, or that it has empirical limits.
· did not understand physical error analysis at all.
· did not realize that ‘±n’ is not ‘+n.’
Some of these traits consistently show up in their papers. I’ve not seen one that deals properly with physical error, with model calibration, or with the impact of model physical error on the reliability of a projected climate.
More thorough-going analyses have been posted up at WUWT, here, here, and here, for example.
In climate model papers the typical uncertainty analyses are about precision, not about accuracy. They are appropriate to engineering models that reproduce observables within their calibration (tuning) bounds. They are not appropriate to physical models that predict future or unknown observables.
Climate modelers are evidently not trained in the scientific method. They are not trained to be scientists. They are not scientists. They are apparently not trained to evaluate the physical or predictive reliability of their own models. They do not manifest the attention to physical reasoning demanded by good scientific practice. In my prior experience they are actively hostile to any demonstration of that diagnosis.
In their hands, climate modeling has become a kind of subjectivist narrative, in the manner of the critical theory pseudo-scholarship that has so disfigured the academic Humanities and Sociology Departments, and that has actively promoted so much social strife. Call it Critical Global Warming Theory. Subjectivist narratives assume what should be proved (CO₂ emissions equate directly to sensible heat), their assumptions have the weight of evidence (CO₂ and temperature, see?), and every study is confirmatory (it’s worse than we thought).
Subjectivist narratives and academic critical theories are prejudicial constructs. They are in opposition to science and reason. Over the last 31 years, climate modeling has attained that state, with its descent into unquestioned assumptions and circular self-confirmations.
A summary of results: The paper shows that advanced climate models project air temperature merely as a linear extrapolation of greenhouse gas (GHG) forcing. That fact is multiply demonstrated, with the bulk of the demonstrations in the SI. A simple equation, linear in forcing, successfully emulates the air temperature projections of virtually any climate model. Willis Eschenbach also discovered that independently, a while back.
After showing its efficacy in emulating GCM air temperature projections, the linear equation is used to propagate the root-mean-square annual average long-wave cloud forcing systematic error of climate models, through their air temperature projections.
The uncertainty in projected temperature is ±1.8 C after 1 year for a 0.6 C projection anomaly and ±18 C after 100 years for a 3.7 C projection anomaly. The predictive content in the projections is zero.
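To see how that figure scales with projection length, here is a minimal sketch (assuming, for illustration only, a constant ±1.8 C per-step uncertainty compounding in quadrature; the paper itself propagates the LWCF calibration error through the emulation equation, as detailed in the SI):

```python
import math

# Illustrative per-step (annual) projection uncertainty taken from the text: +/-1.8 C.
# This is a toy sketch of root-sum-square growth, not the paper's actual computation.
annual_uncertainty_C = 1.8

def propagated_uncertainty(n_years: int) -> float:
    """Root-sum-square of n equal per-step uncertainties: sqrt(n) * u."""
    return math.sqrt(sum(annual_uncertainty_C ** 2 for _ in range(n_years)))

for years in (1, 25, 50, 100):
    print(f"{years:3d} years: +/-{propagated_uncertainty(years):.1f} C")
# After 100 years: +/-18 C, matching the value quoted above.
```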
In short, climate models cannot predict future global air temperatures; not for one year and not for 100 years. Climate model air temperature projections are physically meaningless. They say nothing at all about the impact of CO₂ emissions, if any, on global air temperatures.
Here’s an example of how that plays out.
Panel a: blue points, GISS model E2-H-p1 RCP8.5 global air temperature projection anomalies. Red line, the linear emulation. Panel b: the same except with a green envelope showing the physical uncertainty bounds in the GISS projection due to the ±4 Wm⁻² annual average model long wave cloud forcing error. The uncertainty bounds were calculated starting at 2006.
Were the uncertainty to be calculated from the first projection year, 1850 (not shown in the Figure), the uncertainty bounds would be very much wider, even though the known 20th century temperatures are well reproduced. The reason is that the underlying physics within the model is not correct. Therefore, there’s no physical information about the climate in the projected 20th century temperatures, even though they are statistically close to observations (due to model tuning).
Physical uncertainty bounds represent the state of physical knowledge, not of statistical conformance. The projection is physically meaningless.
The uncertainty due to annual average model long wave cloud forcing error alone (±4 Wm⁻²) is about ±114 times larger than the annual average increase in CO₂ forcing (about 0.035 Wm⁻²). A complete inventory of model error would produce enormously greater uncertainty. Climate models are completely unable to resolve the effects of the small forcing perturbation from GHG emissions.
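For reference, the stated factor is simply the quotient of the two fluxes given above:

$$\frac{\pm 4\ \mathrm{W\,m^{-2}}}{0.035\ \mathrm{W\,m^{-2}}} \approx \pm 114$$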
The unavoidable conclusion is that whatever impact CO₂ emissions may have on the climate cannot have been detected in the past and cannot be detected now.
It seems Exxon didn’t know, after all. Exxon couldn’t have known. Nor could anyone else.
Every single model air temperature projection since 1988 (and before) is physically meaningless. Every single detection-and-attribution study since then is physically meaningless. When it comes to CO₂ emissions and climate, no one knows what they’ve been talking about: not the IPCC, not Al Gore (we knew that), not even the most prominent of climate modelers, and certainly no political poser.
There is no valid physical theory of climate able to predict what CO₂ emissions will do to the climate, if anything. That theory does not yet exist.
The Stefan-Boltzmann equation is not a valid theory of climate, although people who should know better evidently think otherwise, including the NAS and every US scientific society. Their behavior in this is the most amazing abandonment of critical thinking in the history of science.
Absent any physically valid causal deduction, and noting that the climate has multiple rapid response channels to changes in energy flux, and noting further that the climate is exhibiting nothing untoward, one is left with no bearing at all on how much warming, if any, additional CO₂ has produced or will produce.
From the perspective of physical science, it is very reasonable to conclude that any effect of CO₂ emissions is beyond present resolution, and even reasonable to suppose that any possible effect may be so small as to be undetectable within natural variation. Nothing among the present climate observables is in any way unusual.
The analysis upsets the entire IPCC applecart. It eviscerates the EPA’s endangerment finding, and removes climate alarm from the US 2020 election. There is no evidence whatever that CO₂ emissions have increased, are increasing, will increase, or even can increase, global average surface air temperature.
The analysis is straight-forward. It could have been done, and should have been done, 30 years ago. But was not.
All the dark significance attached to whatever is the Greenland ice-melt, or to glaciers retreating from their LIA high-stand, or to changes in Arctic winter ice, or to Bangladeshi deltaic floods, or to Kiribati, or to polar bears, is removed. None of it can be rationally or physically blamed on humans or on CO₂ emissions.
Although I am quite sure this study is definitive, those invested in the reigning consensus of alarm will almost certainly not stand down. The debate is unlikely to stop here.
Raising the eyes, finally, to regard the extended damage: I’d like to finish by turning to the ethical consequence of the global warming frenzy. After some study, one discovers that climate models cannot model the climate. This fact was made clear all the way back in 2001, with the publication of W. Soon, S. Baliunas, S. B. Idso, K. Y. Kondratyev, and E. S. Posmentier, “Modeling climatic effects of anthropogenic carbon dioxide emissions: unknowns and uncertainties,” Climate Res. 18(3), 259-275, available here. The paper remains relevant.
In a well-functioning scientific environment, that paper would have put an end to the alarm about CO₂ emissions. But it didn’t.
Instead the paper was disparaged and then nearly universally ignored (Reading it in 2003 is what set me off. It was immediately obvious that climate modelers could not possibly know what they claimed to know). There will likely be attempts to do the same to my paper: derision followed by burial.
But we now know this for a certainty: all the frenzy about CO₂ and climate was for nothing.
All the anguished adults; all the despairing young people; all the grammar school children frightened to tears and recriminations by lessons about coming doom, and death, and destruction; all the social strife and dislocation. All the blaming, all the character assassinations, all the damaged careers, all the excess winter fuel-poverty deaths, all the men, women, and children continuing to live with indoor smoke, all the enormous sums diverted, all the blighted landscapes, all the chopped and burned birds and the disrupted bats, all the huge monies transferred from the middle class to rich subsidy-farmers.
All for nothing.
There’s plenty of blame to go around, but the betrayal of science garners the most. Those offenses would not have happened had not every single scientific society neglected its duty to diligence.
From the American Physical Society right through to the American Meteorological Society, they all abandoned their professional integrity, and with it their responsibility to defend and practice hard-minded science. Willful neglect? Who knows. Betrayal of science? Absolutely for sure.
Had the American Physical Society been as critical of claims about CO₂ and climate as they were of claims about palladium, deuterium, and cold fusion, none of this would have happened. But they were not.
The institutional betrayal could not be worse; worse than Lysenkoism because there was no Stalin to hold a gun to their heads. They all volunteered.
These outrages: the deaths, the injuries, the anguish, the strife, the malused resources, the ecological offenses, were in their hands to prevent and so are on their heads for account.
In my opinion, the management of every single US scientific society should resign in disgrace. Every single one of them. Starting with Marcia McNutt at the National Academy.
The IPCC should be defunded and shuttered forever.
And the EPA? Who exactly is it that should have rigorously engaged, but did not? In light of apparently studied incompetence at the center, shouldn’t all authority be returned to the states, where it belongs?
And, in a smaller but nevertheless real tragedy, who’s going to tell the so cynically abused Greta? My imagination shies away from that picture.
An Addendum to complete the diagnosis: It’s not just climate models.
Those who compile the global air temperature record do not even know to account for the resolution limits of the historical instruments, see here or here.
They have utterly ignored the systematic measurement error that riddles the air temperature record and renders it unfit for concluding anything about the historical climate, here, here and here.
These problems are in addition to bad siting and UHI effects.
The proxy paleo-temperature reconstructions, the third leg of alarmism, have no distinct relationship at all to physical temperature, here and here.
The whole AGW claim is built upon climate models that do not model the climate, upon climatologically useless air temperature measurements, and upon proxy paleo-temperature reconstructions that are not known to reconstruct temperature.
It all lives on false precision; a state of affairs fully described here, peer-reviewed and all.
Climate alarmism is artful pseudo-science all the way down; made to look like science, but which is not.
Pseudo-science not called out by any of the science organizations whose sole reason for existence is the integrity of science.

In accountancy one has the interesting phenomenon of multiple ‘compensating’ errors self cancelling so that one thinks the accounts are correct when they are not.
This is similar.
Many aspects of the climate are currently unquantifiable so multiple potentially inaccurate parameters are inserted into the starting scenario.
That starting scenario is then tuned to match real world observations but it contains all those multiple compensating errors.
Each one of those errors then compounds with the passage of time and the degree of compensating between the various errors may well vary.
The fact is that over time the inaccurate net effect of the errors accumulates faster and faster with rapidly reducing prospects of unravelling the truth.
Climate models are like a set of accounts stuffed to the brim with errors that sometimes offset and sometimes compound each other such that with the passing of time the prospect of unravelling the mess reduces exponentially.
Pat Frank is explaining that in mathematical terms but given the confusion here maybe it is best to simply rely on verbal conceptual imagery to get the point across.
Climate models are currently worthless and dangerous.
(Found in the spam bin) SUNMOD
A great description of what is going on. In other words, two wrongs don’t make a right or “x” wrongs don’t make a right.
You’ve outlined the problem well, Stephen.
Apart from my reviewers, I’ve not encountered a climate modeler who understands it.
Pat has emphasised (repeatedly) that ‘uncertainty’ should not be construed as a temperature value. My reading of Roy Spencer’s rebuttal was that he had done just that, and Pat describes it as a basic mistake which he sees being made by many who should know better. We need to have some acknowledgement and resolution on this distinction before we can move forward in any meaningful way. One party has to be wrong and the other right here. Which is it? I’m with Pat.
Thank-you, BB.
A statistic is never an actual physical magnitude.
Roy Spencer thinks that the calibration error statistic, ±4 W/m^2, is an energy.
An important exclusion.
Did not contain physical equations that showed any of the warming was related to CO2.
If this was possible it would have been put in writing for all scientists to see, regarding actual scientific theory equations that could be tested via the scientific method.
Pat Frank’s paper makes scientific predictions which I believe are proper in the sense of being falsifiable. I was thinking about how a CMIP5 model would evolve over time, and how much variance it would have compared to Pat’s predictions, and then it struck me that we should ask the owners to run those models, without tinkering except to provide multiple initial states as desired, and see what happens. Let’s set them up starting with this year’s data, 2019, and run for 81 years to 2100. After all, we do want to know how big that “uncertainty monster” is.
I predict that the standard deviation of model predictions at 2100, which is what I take model uncertainty to mean, will be a lot* less than Pat predicts. Likewise it would be interesting to see whether the uncertainty when using an ensemble of n models is reduced by sqrt(n) or if cross-model correlation makes it worse.
But I may be wrong in my prediction and Pat may be right. Let’s see!
* If this were to go ahead I would attempt to come up with a number to replace “lot”.
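As a toy illustration of the sqrt(n) question (entirely invented numbers, no connection to any actual CMIP5 output): when model errors share a common component, only the independent part averages down in an ensemble mean.

```python
import numpy as np

rng = np.random.default_rng(0)

n_models, n_trials = 30, 100_000
sigma_shared, sigma_indep = 1.0, 1.0   # hypothetical error components (arbitrary units)

# Each model's error = a component shared by all models + an independent component.
shared = rng.normal(0.0, sigma_shared, size=(n_trials, 1))
indep = rng.normal(0.0, sigma_indep, size=(n_trials, n_models))
errors = shared + indep

print(f"single-model error spread:  {errors[:, 0].std():.2f}")
print(f"ensemble-mean error spread: {errors.mean(axis=1).std():.2f}")
# With a shared component, the ensemble-mean spread floors near sigma_shared
# rather than shrinking as 1/sqrt(n); only the independent part averages down.
```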
See, O2 Rich, your comment shows that you have no concept of the difference between error and uncertainty.
I just received this email. I’ll share it anonymously, to give you the flavor of comments that apply to your comment, Rich.
+++++++++
Dr. Frank,
I am very glad to see you got your climate model paper published.
Your earlier YouTube talk was my first insight into just how bad the climate community was in their understanding of uncertainty propagation and even their ignorance of variance and how it adds (shaking my head sadly). As a statistician (MS) before my training in experimental Biology, your talk and the new paper clearly state the central flaw; though I also believe that their continuous retrospective tweaking is nothing more than rampant over-fitting, which pretty much makes the whole endeavor fraudulent.
It is appalling that these (pseudo-)scientists seem to have no training in probability and statistics.
One suggestion with respect to answering the various comments I’ve read on the web in response to your paper: perhaps, it would help to remind them that the unknowable changes due to sampling any particular (future) year’s actual values result in a random walk into the future, and remind them that a 95% error bound just encloses all but 5% of the possible such walk outcomes. Based on the comments, I doubt that most of the readers even understand what error bounds mean,
let alone understand how to propagate sequential measurement errors vs. ensemble estimates of model precision.
THANK YOU for hanging in there and getting this paper out into the literature.
Cheers!
++++++++++++
Well done. I had reached the conclusion the models should be scrapped by checking their predictions with the data more than a decade ago. You provide mathematical proof that they are no more than maps of data. A map is not predictive. Would one trust the map of the Sahara beyond its bounds?
I believe anna v is an experimental particle physicist.
Someone who knows far more than I do about physical error analysis.
Pat, I have generally found that if I do not understand something about mathematics or physics then I am not the only one around, even among highly qualified people. So please enlighten us: what >is< the difference between error and uncertainty? And, what exactly can your paper tell us about the possible predictions of GCMs out to 2100? Your Figure 7b suggests +/-20K roughly; is that a 1-sigma or a 2-sigma envelope, or is my ignorance deluding me into thinking that a sigma (standard deviation) has any meaning there. Finally, is your theory falsifiable?
Perhaps this is too simple, but variance and standard deviation are statistical calculations of the range of data and of how far each data point (or a collection of data points, i.e., x sigmas) can range from the mean. This requires either a full population or a sample of a population of random variables. When measuring the same thing with the same device many, many times, you can use the population of data points to develop statistics about what a “true value” of the measurement should be. However, it doesn’t necessarily describe accuracy, only precision. An inaccurate device can deliver precise measurements.
Uncertainty is a different animal. I always consider it as a measure of how accurate a result is. It tells you how systematic and random errors, processes, or calculations can combine and possibly affect the accuracy of the result.
Jim Gorman
I would describe it differently. Accuracy is how close a measurement is to the correct or true value. That might be determined by comparison with a standard, or from a theoretical calculation.
Uncertainty is the expression of how confident one is that a measurement is accurate when it is impossible to determine the true value. That is, it can be expressed as a probability envelope which surrounds the measurement (or calculation in this case). One can say, for example, that it is thought that there is a 95% (or 68%) probability that the measurement/calculation falls within the bounds of the envelope. What Pat is demonstrating is that the bounds are so wide that the prediction/forecast has no practical application or utility. If I told you that something had a value between zero and infinity, would you really know anything useful about the magnitude?
A very nice encapsulation, Clyde.
Far more economical of language than I was able to attain. 🙂
Rich, thanks for your question.
Error is the difference between a predicted observable and the observable itself; essentially a measurement.
Measurement minus prediction = error (in the prediction). Readily calculated.
Uncertainty is the resolution of the prediction from a model (or an instrumental measurement). How certain are the elements going into the prediction?
In a prediction of a future state, clearly an error cannot be calculated because there are no observables.
If the magnitudes of the predictive elements within a model are poorly constrained, then they can produce a range of magnitudes in whatever state is being predicted.
In that case, there is an interval of possible values around the prediction. That interval need not be normally distributed. It could be skewed. The prediction mean is then not necessarily the most probable value.
Although the error in the prediction of a future state cannot be known, the lack of resolution in the predictive model can be known. It is derived by way of a calibration experiment.
A calibration uses the predictive system, the model, to predict a known test case. Typically the test case is a known and relevant observable.
The test observable minus the prediction calibrates the accuracy of the predictive system (the model).
Multiple calibration experiments typically yield a series of test predictions of the known test observable. These allow calculation of a model error interval around the known true value.
The error interval is the calibration error statistic. It is a plus/minus value that conditions all the future predictions made using that predictive model.
When the theory deployed by the model is deficient, the calibration error is systematic. Systematic error can also arise from uncontrolled variables.
Suppose the calibration experiment reveals that the accuracy of the predictive model is poor.
The model then is known to predict an observable as a mean plus or minus an interval of uncertainty about that mean revealed by the now-known calibration accuracy width.
That width provides the uncertainty when the model is used to predict a future state.
That is, when the real futures prediction is made, the now known lack of accuracy in the predictive model gets propagated through the calculations into the futures prediction.
The predictive uncertainty derives from the calibration error statistic propagated through the calculations made by the model. The typical mode of calibration error propagation is root-sum-square through the calculational steps.
The resulting total uncertainty in the prediction is a reliability statistic, not a physical error.
The uncertainty conditions the reliability of the prediction. Tight uncertainty bounds = reliable prediction. Wide uncertainty bounds = unreliable prediction.
Uncertainty is like the pixel size of the prediction. Big pixels = fuzzed out picture. Tiny pixels = good detail.
That is, uncertainty is a resolution limit.
A model can calculate up a discrete magnitude for some future state observable, such as future air temperature.
However, if the predictive uncertainty is large, the prediction can have no physical meaning and the predicted magnitude conveys no information about the future state.
Large predictive uncertainty = low resolution = large pixel size.
Calibration error indicates that error is present in every calculational step. However, in a climate simulation the size of the error is unknown because the steps are projections into the future. So, all that is known is that the phase-space trajectory of the calculated states wanders in some unknown way away from the trajectory of the physically real system.
At the end, one does not know the relative positions of the simulated state and the physically correct state. All one has to go on is the propagated uncertainty in the simulation.
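A minimal numerical sketch of that sequence, with invented calibration numbers (nothing here comes from a real model), just to show the bookkeeping:

```python
import math

# Hypothetical calibration experiment: model predictions of known test observables
# versus the observations themselves (made-up values, purely illustrative).
observed = [10.2, 9.8, 10.5, 10.1, 9.6]
predicted = [11.0, 9.1, 11.2, 9.3, 10.4]

# Calibration error statistic: root-mean-square of (observation - prediction).
residuals = [o - p for o, p in zip(observed, predicted)]
calib_rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
print(f"calibration error statistic: +/-{calib_rmse:.2f}")

# Propagation into a stepwise futures prediction: the +/- statistic conditions
# every calculational step and combines as root-sum-square, so after N steps the
# uncertainty (not the error, which is unknowable for a future state) is:
for n_steps in (1, 10, 100):
    print(f"after {n_steps:3d} steps: +/-{math.sqrt(n_steps) * calib_rmse:.2f}")
```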
Pat,
Your brilliant answer to a great question is clearly a distillation of years of experience.
Thanks Philip 🙂
I hope it’s useful.
Pat, thank you for your expansive reply. The only thing which would help me further would be some mathematical definitions, but I can try to work those out.
However, you only answered one of my questions. Another one was “is your theory falsifiable”, that is, is there any real or computer experiment which could be done to falsify it, or verify it by absence of falsification?
I’m going on holiday now, so may not see your answer for a while.
Thanks, Rich.
Rich, the only way to falsify a physical error analysis that indicates wide model uncertainty bounds is to show high accuracy for the models.
I doubt that can happen any time soon.
However, that doesn’t obviate future advances in climate physics that lead to better, more accurate physical models.
I have said for years that if I did my science labs for my BSME like they do climate science, I would have flunked out, and this covers one of the main reasons. Sad that the idea of how errors propagate, and how error bars need to be found, seems totally missed in almost all climate work. Also sad they tend to do linear analysis for things that are not linear.
Very glad you got this published, and perhaps it will get people doing better work. Seems that many would benefit from learning things like how tolerances stack in mfg. and how gage R&R is calculated, along with what it means. Perhaps then they could apply it to their work.
Various bad analogies have been offered here, so I am going to offer yet another one to try to explain what is going on.
We return to a building site in East Anglia in the year 1300 AD. The local climate scientist, a Monsieur Mann, gives the Anglo-Saxon freeman foreman a list of instructions written in French. After some confusion, it turned out that M. Mann wanted two pegs to be put into the ground, exactly 340.25 metres apart. He then wanted the foreman to get his two serfs to cut a series of lengths of wood, some in red and some in blue, and lay them out on the ground. The specifications were in cms, but the serfs only had a yardstick marked off in inches.
The serfs do their best, and the following day, Monsieur Mann returns to see their work.
“Putain”, il dit. “Vous avez toujours nié l’existence des changements climatiques?” (Incomprehensible in modern translation.)
Mann patiently explains that the distance between the two pegs represents the averaged incoming solar radiation before any reflection.
He has done his dimensional analysis and has converted the distance using a factor of 1 m^3/Watt.
He further explains that the red sticks all represent outgoing LW radiation from different parts of the system. They must be laid end to end starting from the first peg. The blue sticks represent reflected and transmitted SW, and they must be laid end-to-end from the second peg towards the first peg. After further vulgar language and a baseless accusation that they had probably been corrupted by the oil-lamp makers to destroy his work, Monsieur Mann explains that the red and the blue sticks should meet exactly somewhere between the pegs, since the total outgoing LW radiation should be equal to the total solar input less the reflected radiation. He leaves the site muttering under his breath about the corruption of the oil-lamp manufacturers.
After trying to make this work for a while, one of the serfs touches his forelock to the foreman and points out that the red and the blue sticks don’t reach each other. There is a gap of about 4 metres between them. One of the serfs unhelpfully suggests that either they have measured the lengths badly or Monsieur Mann must have made a mistake. The foreman knew that accusing M. Mann of making a mistake was a sure way to the stocks, so he told the serfs to cut all of the red sticks a little longer. When they had done this, strangely there was still a gap of 4 meters, so the foreman took one of the sticks which was mysteriously labeled “le forcing de nuage numero 4”, and he recut it so that the blue sticks and the red sticks arrived at exactly the same place.
When M. Mann arrived the following day, he had brought with him a large bunch of green sticks, which were wrapped in a tapestry with an etiquette saying “forcings”. He examined the blue and red sticks laid out between the pegs and noted with satisfaction that they touched exactly, and subsequently wrote that he therefore had “high confidence” in his results which were “extremely likely”.
He then laid out his green sticks one at a time. Each of them touched the second peg in the ground. He then carefully measured the distance from the first peg to the end of his green stick and subtracted the distance between the pegs. By this method he was able to calculate the (cumulative forcing) for each year. One of the serfs who was more intelligent than the other asked why they had bothered with all this work, since M. Mann would get a more accurate measure of his “forcings” by just measuring the length of his green sticks. The foreman told him not to question his betters.
All was well until about 700 years later when an archaeologist of Irish descent from the Frankish empire, known to his friends as Pat the Frank, found the site in a remarkably well preserved condition. He understood its significance from archaic documents, and measured the length of the best preserved sticks. When he measured the length of the remarkably well preserved stick labeled “le forcing de nuage numero 4”, he compared it with satellite-derived estimates of LW Cloud Forcing and discovered that it was in error by about 12%. He declared that this must then introduce an uncertainty into the length of M. Mann’s green sticks. Was he correct?
I’ve posted these comments to blogs by Dr. Roy Spencer, and add them here for additional exposure.
There seems to be a misunderstanding afoot in the interpretation of the description of uncertainty in iterative climate models. I offer the following examples in the hopes that they clear up some of the mistaken notions apparently driving these erroneous interpretations.
Uncertainty: Describing uncertainty for human understanding is fraught with difficulties, evidence being the lavish casinos that persuade a significant fraction of the population that you can get something from nothing. There are many other examples, some clearer than others, but one successful description of uncertainty is that of the forecast of rain. We know that a 40% chance of rain does not mean it will rain everywhere 40% of the time, nor does it mean that it will rain all of the time in 40% of the places. We however intuitively understand the consequences of comparison of such a forecast with a 10% or a 90% chance of rain.
Iterative Models: Let’s assume we have a collection of historical daily high temperature data for a single location, and we wish to develop a model to predict the daily high temperature at that location on some date in the future. One of the simplest, yet effective, models that one can use to predict tomorrow’s high temperature is to use today’s high temperature. This is the simplest of models, but adequate for our discussion of model uncertainty. Note that at no time will we consider instrument issues such as accuracy, precision and resolution. For our purposes, those issues do not confound the discussion below.
We begin by predicting the high temperatures from the historical data from the day before. (The model is, after all, merely a single-day offset.) We then measure model uncertainty, beginning by calculating each deviation, or residual (observed minus predicted). From these residuals, we can calculate model adequacy statistics, and estimate the average historical uncertainty that exists in this model. Then, we can use that statistic to estimate the uncertainty in a single-day forward prediction.
Now, in order to predict tomorrow’s high temperature, we apply the model to today’s high temperature. From this, we have an “exact” predicted value (today’s high temperature). However, we know from applying our model to historical data that, while this prediction is numerically exact, the actual measured high temperature tomorrow will be a value that contains both deterministic and random components of climate. The above calculated model (in)adequacy statistic will be used to create an uncertainty range around this prediction of the future. So we have a range of ignorance around the prediction of tomorrow’s high temperature. At no time is this range an actual statement of the expected temperature. This range is similar to % chance of rain. It is a method to convey how well our model predicts based on historical data.
Now, in order to predict out two days, we use the “predicted” value for tomorrow (which we know is the same numerical value as today, but now containing uncertainty) and apply our model to the uncertain predicted value for tomorrow. The uncertainty in the input for the second iteration of the model cannot be ‘canceled out’ before the number is used as input to the second application of the model. We are, therefore, somewhat ignorant of what the actual input temperature will be for the second round. And that second application of the model adds its ignorance factor to the uncertainty of the predicted value for two days out, lessening the utility of the prediction as an estimate of day-after-tomorrow’s high temperature. This repeats so that for predictions for several days out, our model is useless in predicting what the high temperature actually will be.
This goes on for each step, ever increasing the ignorance and lessening the utility of each successive prediction as an estimate of that day’s high temperature, due to the growing uncertainty.
This is an unfortunate consequence of the iterative nature of such models. The uncertainties accumulate. They are not biases, which are signal offsets. We do not know what the random error will be until we collect the actual data for that step, so we are uncertain of the value to use in that step when predicting.
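A toy version of that bookkeeping in code, using a synthetic daily-high series and the persistence model described above (the numbers are invented; the point is only how the accumulated uncertainty is tallied):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic daily-high series: a seasonal cycle plus day-to-day noise (hypothetical).
days = np.arange(3650)
highs = 15 + 10 * np.sin(2 * np.pi * days / 365.25) + rng.normal(0, 2.0, days.size)

# Persistence model: tomorrow's high = today's high.
residuals = highs[1:] - highs[:-1]   # observed minus predicted, one day ahead
one_step_u = residuals.std()         # the model (in)adequacy statistic

# Uncertainty for a k-day-ahead prediction, accumulated in quadrature at each
# iteration, as described in the comment above.
for k in (1, 2, 5, 10, 30):
    print(f"{k:2d} days ahead: +/-{np.sqrt(k) * one_step_u:.1f}")
```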
To Pat Frank
What I don’t understand is why you needed to show that all existing models can be emulated with an expression which is linearly dependent on CO2 concentration. Would it not have been easier to input the +/-4 W/m^2 error into the various models to calculate the absolute error (extreme points) using a differential method, using the average value of -27.6 W/m^2 CF as reference to determine the relative error? See this link (http://www.animations.physics.unsw.edu.au/sf/toolkits/Errors_and_Error_Estimation.pdf) to calculate error for y=f(x) on page xxii. Was the problem that the complex math in these models makes calculation of error propagation too difficult? For a sum the absolute errors are relevant, for a product or quotient the relative errors are, and for a complex mathematical function it becomes quite complicated… then the iterations add even more to the complexity. What certainly is true is that the +/-4 W/m^2 error is huge compared to the 0.035 W/m^2 CO2 effect the climate modelers want to resolve. I also believe that cloud formation is key. Henrik Svensmark’s data was also hindered from publication for years.
Eric Viera, if you want to do a differential methodological analysis of climate model physics, go right ahead.
The fact that the air temperature projection of any climate model can be successfully emulated using a simple expression linear on fractional change in GHG forcing was a novel result all by itself.
That demonstration opened the models to a reliability analysis based on linear propagation of error.
What would be the point of doing your very difficult error analysis, when a straight-forward error analysis provides the information needed for a judgment?
So, please do go ahead and carry out your analysis. Publish the results. Until then, we can all wonder whether it will yield a different judgment.
Here is a simple example. Let us say you take 1000 measurements one time and get a distribution of values with a mean of 15 and a standard deviation of 1. Let’s say you do this for 25 years and each time you get an average of 15 with a standard deviation of 1. So, what is the standard deviation for the time series of measurements, 25 (Pat Frank) or 1 (Nick and Eli)?
Best
Really foolish comment, Eli.
Your example has nothing to do with any part of my analysis.
For the benefit of all, I’ve put together an extensive post that provides quotes, citations, and URLs for a variety of papers — mostly from engineering journals, but I do encourage everyone to closely examine Vasquez and Whiting — that discuss error analysis, the meaning of uncertainty, uncertainty analysis, and the mathematics of uncertainty propagation.
These papers utterly support the error analysis in “Propagation of Error and the Reliability of Global Air Temperature Projections.”
Summarizing: Uncertainty is a measure of ignorance. It is derived from calibration experiments.
Multiple uncertainties propagate as root sum square. Root-sum-square has positive and negative roots (+/-). Never anything else, unless one wants to consider the uncertainty absolute value.
Uncertainty is an ignorance width. It is not an energy. It does not affect energy balance. It has no influence on TOA energy or any other magnitude in a simulation, or any part of a simulation, period.
Uncertainty does not imply that models should vary from run to run, Nor does it imply inter-model variation. Nor does it necessitate lack of TOA balance in a climate model.
For those who are scientists and who insist that uncertainty is an energy and influences model behavior (none of you will be engineers), or that a (+/-)uncertainty is a constant offset, I wish you a lot of good luck because you’ll not get anywhere.
For the deep-thinking numerical modelers who think rmse = constant offset or is a correlation: you’re wrong.
The literature follows:
Moffat RJ. Contributions to the Theory of Single-Sample Uncertainty Analysis. Journal of Fluids Engineering. 1982;104(2):250-8.
“Uncertainty Analysis is the prediction of the uncertainty interval which should be associated with an experimental result, based on observations of the scatter in the raw data used in calculating the result.
Real processes are affected by more variables than the experimenters wish to acknowledge. A general representation is given in equation (1), which shows a result, R, as a function of a long list of real variables. Some of these are under the direct control of the experimenter, some are under indirect control, some are observed but not controlled, and some are not even observed.
R=R(x_1,x_2,x_3,x_4,x_5,x_6, . . . ,x_N)
It should be apparent by now that the uncertainty in a measurement has no single value which is appropriate for all uses. The uncertainty in a measured result can take on many different values, depending on what terms are included. Each different value corresponds to a different replication level, and each would be appropriate for describing the uncertainty associated with some particular measurement sequence.
The Basic Mathematical Forms
The uncertainty estimates, dx_i or dx_i/x_i in this presentation, are based, not upon the present single-sample data set, but upon a previous series of observations (perhaps as many as 30 independent readings) … In a wide-ranging experiment, these uncertainties must be examined over the whole range, to guard against singular behavior at some points.
Absolute Uncertainty
x_i = (x_i)_avg (+/-)dx_i
Relative Uncertainty
x_i = (x_i)_avg (+/-)dx_i/x_i
Uncertainty intervals throughout are calculated as (+/-)sqrt[sum over (error)^2].
The uncertainty analysis allows the researcher to anticipate the scatter in the experiment, at different replication levels, based on present understanding of the system.
The calculated value dR_0 represents the minimum uncertainty in R which could be obtained. If the process were entirely steady, the results of repeated trials would lie within (+/-)dR_0 of their mean …”
Nth Order Uncertainty
The calculated value of dR_N, the Nth order uncertainty, estimates the scatter in R which could be expected with the apparatus at hand if, for each observation, every instrument were exchanged for another unit of the same type. This estimates the effect upon R of the (unknown) calibration of each instrument, in addition to the first-order component. The Nth order calculations allow studies from one experiment to be compared with those from another ostensibly similar one, or with “true” values.”
Here replace “instrument” with ‘climate model.’ The relevance is immediately obvious. An Nth order GCM calibration experiment averages the expected uncertainty from N models and allows comparison of the results of one model run with another, in the sense that the reliability of their predictions can be evaluated against the general dR_N.
Continuing: “The Nth order uncertainty calculation must be used wherever the absolute accuracy of the experiment is to be discussed. First order will suffice to describe scatter on repeated trials, and will help in developing an experiment, but Nth order must be invoked whenever one experiment is to be compared with another, with computation, analysis, or with the “truth.”
Nth order uncertainty, “
*Includes instrument calibration uncertainty, as well as unsteadiness and interpolation.
*Useful for reporting results and assessing the significance of differences between results from different experiment and between computation and experiment.
The basic combinatorial equation is the Root-Sum-Square:
dR = sqrt[sum over((dR/dx_i)*dx_i)^2]”
https://doi.org/10.1115/1.3241818
Moffat RJ. Describing the uncertainties in experimental results. Experimental Thermal and Fluid Science. 1988;1(1):3-17.
“The error in a measurement is usually defined as the difference between its true value and the measured value. … The term “uncertainty” is used to refer to “a possible value that an error may have.” … The term “uncertainty analysis” refers to the process of estimating how great an effect the uncertainties in the individual measurements have on the calculated result.
THE BASIC MATHEMATICS
This section introduces the root-sum-square (RSS) combination (my bold), the basic form used for combining uncertainty contributions in both single-sample and multiple-sample analyses. In this section, the term dX_i refers to the uncertainty in X_i in a general and nonspecific way: whatever is being dealt with at the moment (for example, fixed errors, random errors, or uncertainties).
Describing One Variable
Consider a variable X_i, which has a known uncertainty dX_i. The form for representing this variable and its uncertainty is
X=X_i(measured) (+/-)dX_i (20:1)
This statement should be interpreted to mean the following:
* The best estimate of X, is X_i (measured)
* There is an uncertainty in X_i that may be as large as (+/-)dX_i
* The odds are 20 to 1 against the uncertainty of X_i being larger than (+/-)dX_i.
The value of dX_i represents 2-sigma for a single-sample analysis, where sigma is the standard deviation of the population of possible measurements from which the single sample X_i was taken.
The uncertainty (+/-)dX_i Moffat described, exactly represents the (+/-)4W/m^2 LWCF calibration error statistic derived from the combined individual model errors in the test simulations of 27 CMIP5 climate models.
For multiple-sample experiments, dX_i can have three meanings. It may represent t*S_(N)/sqrt(N) for random error components, where S_(N) is the standard deviation of the set of N observations used to calculate the mean value (X_i)_bar and t is the Student’s t-statistic appropriate for the number of samples N and the confidence level desired. It may represent the bias limit for fixed errors (this interpretation implicitly requires that the bias limit be estimated at 20:1 odds). Finally, dX_i may represent U_95, the overall uncertainty in X_i.
From the “basic mathematics” section above, the over-all uncertainty U = root-sum-square = sqrt[sum over((+/-)dX_i)^2] = the root-sum-square of errors (rmse). That is, U = sqrt[sum over((+/-)dX_i)^2] = (+/-)rmse.
The result R of the experiment is assumed to be calculated from a set of measurements using a data interpretation program (by hand or by computer) represented by
R = R(X_1,X_2,X_3,…, X_N)
The objective is to express the uncertainty in the calculated result at the same odds as were used in estimating the uncertainties in the measurements.
The effect of the uncertainty in a single measurement on the calculated result, if only that one measurement were in error would be
dR_x_i = (dR/dX_i)*dX_i
When several independent variables are used in the function R, the individual terms are combined by a root-sum-square method.
dR = sqrt[sum over((dR/dX_i)*dX_i)^2]
This is the basic equation of uncertainty analysis. Each term represents the contribution made by the uncertainty in one variable, dX_i, to the overall uncertainty in the result, dR.
http://www.sciencedirect.com/science/article/pii/089417778890043X
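As a concrete instance of the root-sum-square equation quoted above (invented numbers, purely to show the mechanics):

```python
import math

# Illustrative only: propagate uncertainties through R = X1 * X2 / X3 using the
# root-sum-square formula quoted above. All values are made up.
X = {"X1": 4.0, "X2": 2.5, "X3": 5.0}
dX = {"X1": 0.2, "X2": 0.1, "X3": 0.3}

R = X["X1"] * X["X2"] / X["X3"]

# Partial derivatives dR/dX_i for this particular R
dRdX = {
    "X1": X["X2"] / X["X3"],
    "X2": X["X1"] / X["X3"],
    "X3": -X["X1"] * X["X2"] / X["X3"] ** 2,
}

dR = math.sqrt(sum((dRdX[k] * dX[k]) ** 2 for k in X))
print(f"R = {R:.3f} +/- {dR:.3f}")
```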
Vasquez VR, Whiting WB. Accounting for Both Random Errors and Systematic Errors in Uncertainty Propagation Analysis of Computer Models Involving Experimental Measurements with Monte Carlo Methods. Risk Analysis. 2006;25(6):1669-81.
[S]ystematic errors are associated with calibration bias in the methods and equipment used to obtain the properties. Experimentalists have paid significant attention to the effect of random errors on uncertainty propagation in chemical and physical property estimation. However, even though the concept of systematic error is clear, there is a surprising paucity of methodologies to deal with the propagation analysis of systematic errors. The effect of the latter can be more significant than usually expected.
Usually, it is assumed that the scientist has reduced the systematic error to a minimum, but there are always irreducible residual systematic errors. On the other hand, there is a psychological perception that reporting estimates of systematic errors decreases the quality and credibility of the experimental measurements, which explains why bias error estimates are hardly ever found in literature data sources.
Of particular interest are the effects of possible calibration errors in experimental measurements. The results are analyzed through the use of cumulative probability distributions (cdf) for the output variables of the model.”
A good general definition of systematic uncertainty is the difference between the observed mean and the true value.”
Also, when dealing with systematic errors we found from experimental evidence that in most of the cases it is not practical to define constant bias backgrounds. As noted by Vasquez and Whiting (1998) in the analysis of thermodynamic data, the systematic errors detected are not constant and tend to be a function of the magnitude of the variables measured.”
Additionally, random errors can cause other types of bias effects on output variables of computer models. For example, Faber et al. (1995a, 1995b) pointed out that random errors produce skewed distributions of estimated quantities in nonlinear models. Only for linear transformation of the data will the random errors cancel out.”
Although the mean of the cdf for the random errors is a good estimate for the unknown true value of the output variable from the probabilistic standpoint, this is not the case for the cdf obtained for the systematic effects, where any value on that distribution can be the unknown true. The knowledge of the cdf width in the case of systematic errors becomes very important for decision making (even more so than for the case of random error effects) because of the difficulty in estimating which is the unknown true output value. (emphasis in original)
It is important to note that when dealing with nonlinear models, equations such as Equation (2) will not estimate appropriately the effect of combined errors because of the nonlinear transformations performed by the model.
Equation (2) is the standard uncertainty propagation sqrt[sum over(±sys error statistic)^2].
In principle, under well-designed experiments, with appropriate measurement techniques, one can expect that the mean reported for a given experimental condition corresponds truly to the physical mean of such condition, but unfortunately this is not the case under the presence of unaccounted systematic errors.
When several sources of systematic errors are identified, beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:
beta ~ sqrt[sum over(theta_S_i)^2], where i defines the sources of bias errors and theta_S is the bias range within the error source i. Similarly, the same approach is used to define a total random error based on individual standard deviation estimates,
e_k = sqrt[sum over(sigma_R_i)^2]
A similar approach for including both random and bias errors in one term is presented by Deitrich (1991), with minor variations, from a conceptual standpoint, from the one presented by ANSI/ASME (1998).
http://dx.doi.org/10.1111/j.1539-6924.2005.00704.x
Kline SJ. The Purposes of Uncertainty Analysis. Journal of Fluids Engineering. 1985;107(2):153-60.
The Concept of Uncertainty
Since no measurement is perfectly accurate, means for describing inaccuracies are needed. It is now generally agreed that the appropriate concept for expressing inaccuracies is an “uncertainty” and that the value should be provided by an “uncertainty analysis.”
An uncertainty is not the same as an error. An error in measurement is the difference between the true value and the recorded value; an error is a fixed number and cannot be a statistical variable. An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable.
The term “calibration experiment” is used in this paper to denote an experiment which: (i) calibrates an instrument or a thermophysical property against established standards; (ii) measures the desired output directly as a measurand so that propagation of uncertainty is unnecessary.
The information transmitted from calibration experiments into a complete engineering experiment on engineering systems or a record experiment on engineering research needs to be in a form that can be used in appropriate propagation processes (my bold). … Uncertainty analysis is the sine qua non for record experiments and for systematic reduction of errors in experimental work.
Uncertainty analysis is … an additional powerful cross-check and procedure for ensuring that requisite accuracy is actually obtained with minimum cost and time.
Propagation of Uncertainties Into Results
In calibration experiments, one measures the desired result directly. No problem of propagation of uncertainty then arises; we have the desired results in hand once we complete measurements. In nearly all other experiments, it is necessary to compute the uncertainty in the results from the estimates of uncertainty in the measurands. This computation process is called “propagation of uncertainty.”
Let R be a result computed from n measurands x_1, … x_n„ and W denotes an uncertainty with the subscript indicating the variable. Then, in dimensional form, we obtain: (W_R = sqrt[sum over(error_i)^2]).”
https://doi.org/10.1115/1.3242449
Henrion M, Fischhoff B. Assessing uncertainty in physical constants. American Journal of Physics. 1986;54(9):791-8.
“Error” is the actual difference between a measurement and the value of the quantity it is intended to measure, and is generally unknown at the time of measurement. “Uncertainty” is a scientist’s assessment of the probable magnitude of that error.
https://aapt.scitation.org/doi/abs/10.1119/1.14447
Could someone send all of this to Greta ? I know she will be devastated, but it might do her good to see what a real scientific discussion looks like? Thank you.
“Uncertainty is a measure of ignorance.”
That’s what I was trying to get to in my comments on Dr. Spencer’s site regarding “The Pause”. The Pause was not predicted by the models and Dr. Spencer believes it was an “internally generated error” (which it might be) but I tend to think it was the result of ignorance about the underlying science base for climate. So, statistical error or ignorance?
But, if I said that we know 10% of the underlying science in System A and 50% of the science in System B, which one would have the greatest uncertainty going forward?
I’m out over my skis. I should shut up.
Wonderful! But two things: 1) Until the planetary temperatures start to cool to below the 30-year running mean, they have carte blanche to exaggerate their point. 2) More problematic is the fact that unless the earth turns into the Garden of Eden, they will weaponize any weather event they can get their hands on, and a willing public will accept it.
But this is a wonderful tour de force of reasoning that comes naturally to someone wishing to look at this issue with an open mind. Outstanding, and thank you!
This illustration might clarify the meaning of (+/-)4 W/m^2 of uncertainty in annual average LWCF.
The question to be addressed is what accuracy is necessary in simulated cloud fraction to resolve the annual impact of CO2 forcing?
We know from Lauer and Hamilton that the average CMIP5 (+/-)12.1% annual cloud fraction (CF) error produces an annual average (+/-)4 W/m^2 error in long wave cloud forcing (LWCF).
We also know that the annual average increase in CO2 forcing is about 0.035 W/m^2.
Assuming a linear relationship between cloud fraction error and LWCF error, the (+/-)12.1% CF error is proportionately responsible for (+/-)4 W/m^2 annual average LWCF error.
Then one can estimate the level of resolution necessary to reveal the annual average cloud fraction response to CO2 forcing as, (0.035 W/m^2/(+/-)4 W/m^2)*(+/-)12.1% cloud fraction = 0.11% change in cloud fraction.
This indicates that a climate model needs to be able to accurately simulate a 0.11% feedback response in cloud fraction to resolve the annual impact of CO2 emissions on the climate.
That is, the cloud feedback to a 0.035 W/m^2 annual CO2 forcing needs to be known, and able to be simulated, to a resolution of 0.11% in CF in order to know how clouds respond to annual CO2 forcing.
Alternatively, we know the total tropospheric cloud feedback effect is about -25 W/m^2. This is the cumulative influence of 67% global cloud fraction.
The annual tropospheric CO2 forcing is, again, about 0.035 W/m^2. The CF equivalent that produces this feedback energy flux is again linearly estimated as (0.035 W/m^2/25 W/m^2)*67% = 0.094%.
Assuming the linear relations are reasonable, both methods indicate that the model resolution needed to accurately simulate the annual cloud feedback response of the climate, to an annual 0.035 W/m^2 of CO2 forcing, is about 0.1% CF.
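The arithmetic is easy to reproduce; a quick check using exactly the values quoted in this comment:

```python
# Values as quoted above (annual averages).
co2_forcing = 0.035      # W/m^2, annual average increase in CO2 forcing
lwcf_error = 4.0         # W/m^2, CMIP5 annual average LWCF calibration error
cf_error = 12.1          # %, corresponding cloud fraction error
cloud_feedback = 25.0    # W/m^2, magnitude of total tropospheric cloud feedback
cloud_fraction = 67.0    # %, global cloud fraction

print(f"method 1: {co2_forcing / lwcf_error * cf_error:.2f}% cloud fraction")            # ~0.11%
print(f"method 2: {co2_forcing / cloud_feedback * cloud_fraction:.3f}% cloud fraction")  # ~0.094%
```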
To achieve that level of resolution, the model must accurately simulate cloud type, cloud distribution and cloud height, as well as precipitation and tropical thunderstorms.
This analysis illustrates the meaning of the (+/-)4 W/m^2 LWCF error. That error indicates the overall level of ignorance concerning cloud response and feedback.
The CF ignorance is such that tropospheric thermal energy flux is never known to better than (+/-)4 W/m^2. This is true whether forcing from CO2 emissions is present or not.
GCMs cannot simulate cloud response to 0.1% accuracy. It is not possible to simulate how clouds will respond to CO2 forcing.
It is therefore not possible to simulate the effect of CO2 emissions, if any, on air temperature.
As the model steps through the projection, our knowledge of the consequent global CF steadily diminishes because a GCM cannot simulate the global cloud response to CO2 forcing, and thus cloud feedback, at all for any step.
It is true in every step of a simulation. And it means that projection uncertainty compounds because every erroneous intermediate climate state is subjected to further simulation error.
This is why the uncertainty in projected air temperature increases so dramatically. The model is step-by-step walking away from initial value knowledge further and further into ignorance.
On an annual average basis, the uncertainty in CF feedback is (+/-)114 times larger than the perturbation to be resolved.
The CF response is so poorly known, that even the first simulation step enters terra incognita.