Guest essay by Pat Frank
For going on two years now, I’ve been trying to publish a manuscript that critically assesses the reliability of climate model projections. The manuscript has been submitted twice to each of two leading climate journals and rejected each time, for a total of four rejections, all on the advice of nine of the ten reviewers. More on that below.
The analysis propagates climate model error through global air temperature projections, using a formalized version of the “passive warming model” (PWM) GCM emulator reported in my 2008 Skeptic article. Propagation of error through a GCM temperature projection reveals its predictive reliability.
Those interested can consult the invited poster (2.9 MB pdf) I presented at the 2013 AGU Fall Meeting in San Francisco. Error propagation is a standard way to assess the reliability of an experimental result or a model prediction. However, climate models are never assessed this way.
Here’s an illustration: the Figure below shows what happens when the average ±4 Wm-2 long-wave cloud forcing error of CMIP5 climate models [1] is propagated through a pair of Community Climate System Model 4 (CCSM4) global air temperature projections.
In panel a, the points show the CCSM4 anomaly projections of the AR5 Representative Concentration Pathways (RCP) 6.0 (green) and 8.5 (blue). The lines are the PWM emulations of the CCSM4 projections, made using the standard RCP forcings from Meinshausen [2]. The CCSM4 RCP forcings may not be identical to the Meinshausen RCP forcings. The shaded areas are the range of projections across all AR5 models (see AR5 Figure TS.15). The CCSM4 projections are in the upper range.
In panel b, the lines are the same two CCSM4 RCP projections. But now the shaded areas are the uncertainty envelopes resulting when ±4 Wm-2 CMIP5 long wave cloud forcing error is propagated through the projections in annual steps.
The uncertainty is so large because ±4 Wm-2 of annual long-wave cloud forcing error is ±114× larger than the annual average 0.035 Wm-2 forcing increase from GHG emissions since 1979. Typical error bars for CMIP5 climate model projections are about ±14 C after 100 years and ±18 C after 150 years.
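For readers who want to check the arithmetic, here is a minimal back-of-envelope sketch in Python. It is my own illustration, not the manuscript’s calculation: it assumes a constant, purely illustrative per-step temperature uncertainty and the standard root-sum-square growth rule, whereas the manuscript propagates the cloud forcing error through the year-by-year forcing, so the numbers only approximately reproduce the quoted envelopes.

```python
import math

# Ratio of the CMIP5 annual long-wave cloud forcing error to the annual GHG forcing increase
lwcf_error = 4.0            # +/- W m-2, annual average long-wave cloud forcing error
annual_ghg_forcing = 0.035  # W m-2 per year, average GHG forcing increase since 1979
print(f"error/forcing ratio: ~{lwcf_error / annual_ghg_forcing:.0f}x")   # ~114x

# Root-sum-square growth of an assumed constant per-step temperature uncertainty
u_step = 1.4  # C per annual step (illustrative value only)
for years in (100, 150):
    u_total = math.sqrt(years) * u_step
    print(f"after {years} annual steps: +/-{u_total:.0f} C")
# Prints roughly +/-14 C at 100 years and +/-17 C at 150 years, the same order of
# magnitude as the +/-14 C and +/-18 C envelopes quoted above.
```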
It’s immediately clear that climate models are unable to resolve any thermal effect of greenhouse gas emissions, or to tell us anything about future air temperatures. It is impossible that climate models have ever resolved an anthropogenic greenhouse signal, not now, nor at any time in the past.
Propagation of errors through a calculation is a simple idea. It’s logically obvious. It’s critically important. It gets pounded into every single freshman physics, chemistry, and engineering student.
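For the record, the rule in question is the standard propagation of independent errors, as given in, e.g., Bevington and Robinson [10]: for a result y = f(x1, ..., xn) with uncertainties u(xi),

```latex
u(y) = \sqrt{\sum_{i=1}^{n}\left(\frac{\partial f}{\partial x_i}\right)^{2} u^{2}(x_i)}
\qquad\Longrightarrow\qquad
u_{\mathrm{total}} = \sqrt{\sum_{i=1}^{n} u_i^{2}}
\quad\text{for a result accumulated over } n \text{ sequential steps.}
```

The second form, the root-sum-square of the per-step uncertainties, is the one at issue throughout the discussion below.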
And it has escaped the grasp of every single Ph.D. climate modeler I have encountered, in conversation or in review.
That brings me to the reason I’m writing here. My manuscript has been rejected four times; twice each from two high-ranking climate journals. I have responded to a total of ten reviews.
Nine of the ten reviews were clearly written by climate modelers, were uniformly negative, and recommended rejection. One reviewer was clearly not a climate modeler. That one recommended publication.
I’ve had my share of scientific debates. A couple of them not entirely amiable. My research (with colleagues) has overthrown four ‘ruling paradigms,’ and so I’m familiar with how scientists behave when they’re challenged. None of that prepared me for the standards at play in climate science.
I’ll start with the conclusion, and follow on with the supporting evidence: never, in all my experience with peer-reviewed publishing, have I encountered such incompetence in a reviewer. Much less incompetence evidently common to an entire class of reviewers.
The shocking lack of competence I encountered convinced me that public exposure would serve as a civic corrective.
Physical error analysis is critical to all of science, especially experimental physical science. It is not too much to call it central.
Result ± error tells what one knows. If the error is larger than the result, one doesn’t know anything. Geoff Sherrington has been eloquent about the hazards and trickiness of experimental error.
All of the physical sciences hew to these standards. Physical scientists are bound by them.
Climate modelers do not hew to these standards and, by their lights, are not bound by them.
I will give examples of all of the following concerning climate modelers:
- They neither respect nor understand the distinction between accuracy and precision.
- They understand nothing of the meaning or method of propagated error.
- They think physical error bars mean the model itself is oscillating between the uncertainty extremes. (I kid you not.)
- They don’t understand the meaning of physical error.
- They don’t understand the importance of a unique result.
Bottom line? Climate modelers are not scientists. Climate modeling is not a branch of physical science. Climate modelers are unequipped to evaluate the physical reliability of their own models.
The incredibleness that follows is verbatim reviewer transcript, quoted in italics. Every idea below is presented as the reviewer meant it. No quote has been deprived of its context, and none has been truncated into something other than what the reviewer meant.
And keep in mind that these are arguments that certain editors of certain high-ranking climate journals found persuasive.
1. Accuracy vs. Precision
The distinction between accuracy and precision is central to the argument presented in the manuscript, and is defined right in the Introduction.
The accuracy of a model is the difference between its predictions and the corresponding observations.
The precision of a model is the variance of its predictions, without reference to observations.
Physical evaluation of a model requires an accuracy metric.
There is nothing more basic to science itself than the critical distinction of accuracy from precision.
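To make the distinction concrete, here is a toy numerical illustration of my own (the numbers are invented for the example): an ensemble of predictions can be tightly clustered, and so look very precise, while sitting far from the observation, and so be badly inaccurate.

```python
import statistics

observation = 10.0
# A tightly clustered set of model predictions sharing a common bias (invented numbers)
predictions = [13.1, 12.9, 13.0, 13.2, 12.8]

precision = statistics.stdev(predictions)               # spread among predictions only
accuracy = statistics.mean(predictions) - observation   # mean difference from the observation

print(f"precision (ensemble spread): {precision:.2f}")           # ~0.16, looks reassuringly tight
print(f"accuracy  (prediction - observation): {accuracy:+.2f}")  # +3.00, the physical error
```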
Here’s what climate modelers say:
“Too much of this paper consists of philosophical rants (e.g., accuracy vs. precision) …”
“[T]he author thinks that a probability distribution function (pdf) only provides information about precision and it cannot give any information about accuracy. This is wrong, and if this were true, the statisticians could resign.”
“The best way to test the errors of the GCMs is to run numerical experiments to sample the predicted effects of different parameters…”
“The author is simply asserting that uncertainties in published estimates [i.e., model precision – P] are not ‘physically valid’ [i.e., not accuracy – P]- an opinion that is not widely shared.”
Not widely shared among climate modelers, anyway.
The first reviewer actually scorned the distinction between accuracy and precision. This, from a supposed scientist.
The remainder are alternative declarations that model variance, i.e., precision, equals physical accuracy.
The accuracy-precision difference was extensively documented to relevant literature in the manuscript, e.g., [3, 4].
The reviewers ignored that literature. The final reviewer dismissed it as mere assertion.
Every climate modeler reviewer who addressed the precision-accuracy question similarly failed to grasp it. I have yet to encounter one who understands it.
2. No understanding of propagated error
“The authors claim that published projections do not include ‘propagated errors’ is fundamentally flawed. It is clearly the case that the model ensemble may have structural errors that bias the projections.”
I.e., the reviewer supposes that model precision = propagated error.
“The repeated statement that no prior papers have discussed propagated error in GCM projections is simply wrong (Rogelj (2013), Murphy (2007), Rowlands (2012)).”
Let’s take the reviewer examples in order:
Rogelj (2013) concerns the economic costs of mitigation. Their Figure 1b includes a global temperature projection plus uncertainty ranges. The uncertainties, “are based on a 600-member ensemble of temperature projections for each scenario…” 
I.e., the reviewer supposes that model precision = propagated error.
Murphy (2007) write, “In order to sample the effects of model error, it is necessary to construct ensembles which sample plausible alternative representations of earth system processes.” 
I.e., the reviewer supposes that model precision = propagated error.
Rowlands (2012) write, “Here we present results from a multi-thousand-member perturbed-physics ensemble of transient coupled atmosphere–ocean general circulation model simulations. “ and go on to state that, “Perturbed-physics ensembles offer a systematic approach to quantify uncertainty in models of the climate system response to external forcing, albeit within a given model structure.” 
I.e., the reviewer supposes that model precision = propagated error.
Not one of this reviewer’s examples of propagated error includes any propagated error, or even mentions propagated error.
Not only that, but not one of the examples discusses physical error at all. It’s all model precision.
This reviewer doesn’t know what propagated error is, what it means, or how to identify it. This reviewer also evidently does not know how to recognize physical error itself.
“Examples of uncertainty propagation: Stainforth, D. et al., 2005: Uncertainty in predictions of the climate response to rising levels of greenhouse gases. Nature 433, 403-406.
“M. Collins, R. E. Chandler, P. M. Cox, J. M. Huthnance, J. Rougier and D. B. Stephenson, 2012: Quantifying future climate change. Nature Climate Change, 2, 403-409.”
Let’s find out: Stainforth (2005) includes three Figures; every single one of them presents error as projection variation.
Here’s their Figure 1:
Original Figure Legend: “Figure 1 Frequency distributions of Tg (colours indicate density of trajectories per 0.1 K interval) through the three phases of the simulation. a, Frequency distribution of the 2,017 distinct independent simulations. b, Frequency distribution of the 414 model versions. In b, Tg is shown relative to the value at the end of the calibration phase and where initial condition ensemble members exist, their mean has been taken for each time point.”
Here’s what they say about uncertainty: “[W]e have carried out a grand ensemble (an ensemble of ensembles) exploring uncertainty in a state-of-the-art model. Uncertainty in model response is investigated using a perturbed physics ensemble in which model parameters are set to alternative values considered plausible by experts in the relevant parameterization schemes.”
There it is: uncertainty is directly represented as model variability (density of trajectories; perturbed physics ensemble).
The remaining figures in Stainforth (2005) derive from this one. Propagated error appears nowhere and is nowhere mentioned.
Reviewer supposition: model precision = propagated error.
Collins (2012) state that adjusting model parameters so that projections approach observations is enough to “hope” that a model has physical validity. Propagation of error is never mentioned. Collins Figure 3 shows physical uncertainty as model variability about an ensemble mean.  Here it is:
Original Legend: “Figure 3 | Global temperature anomalies. a, Global mean temperature anomalies produced using an EBM forced by historical changes in well-mixed greenhouse gases and future increases based on the A1B scenario from the Intergovernmental Panel on Climate Change’s Special Report on Emission Scenarios. The different curves are generated by varying the feedback parameter (climate sensitivity) in the EBM. b, Changes in global mean temperature at 2050 versus global mean temperature at the year 2000, … The histogram on the x axis represents an estimate of the twentieth-century warming attributable to greenhouse gases. The histogram on the y axis uses the relationship between the past and the future to obtain a projection of future changes.”
Collins 2012, part a: model variability itself; part b: model variability (precision) represented as physical uncertainty (accuracy). Propagated error? Nowhere to be found.
So, once again, not one of this reviewer’s examples of propagated error actually includes any propagated error, or even mentions propagated error.
It’s safe to conclude that these climate modelers have no concept at all of propagated error. They apparently have no concept whatever of physical error.
Every single time any of the reviewers addressed propagated error, they revealed a complete ignorance of it.
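To spell out the contrast the reviewers keep missing, here is a toy construction of my own (nothing below comes from the cited papers): when every ensemble member shares the same systematic per-step error, the ensemble spread stays small while the uncertainty propagated from that shared error grows large. Spread measures precision; it says nothing about the propagated physical error.

```python
import math
import random

random.seed(0)
N_MODELS, N_YEARS = 20, 100
trend = 0.02               # C/year, identical forced warming in every toy model
shared_step_error = 1.4    # C, systematic per-step error common to all models (assumed)
offsets = [random.gauss(0.0, 0.05) for _ in range(N_MODELS)]  # small inter-model differences

# Each toy model's anomaly after N_YEARS; the shared error never shows up here
projections = [trend * N_YEARS + off for off in offsets]

spread = max(projections) - min(projections)            # the "uncertainty" an ensemble reports
propagated = math.sqrt(N_YEARS) * shared_step_error     # root-sum-square of the shared error

print(f"ensemble spread after {N_YEARS} years:        {spread:.2f} C")         # a fraction of a degree
print(f"propagated uncertainty after {N_YEARS} years: +/-{propagated:.1f} C")  # about +/-14 C
```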
3. Error bars mean model oscillation – wherein climate modelers reveal a fatal case of naive-freshman-itis.
“To say that this error indicates that temperatures could hugely cool in response to CO2 shows that their model is unphysical.”
“[T]his analysis would predict that the models will swing ever more wildly between snowball and runaway greenhouse states.”
“Indeed if we carry such error propagation out for millennia we find that the uncertainty will eventually be larger than the absolute temperature of the Earth, a clear absurdity.”
“An entirely equivalent argument [to the error bars] would be to say (accurately) that there is a 2K range of pre-industrial absolute temperatures in GCMs, and therefore the global mean temperature is liable to jump 2K at any time – which is clearly nonsense…”
Got that? These climate modelers think that “±” error bars imply the model itself is oscillating (liable to jump) between the error bar extremes.
Or that the bars from propagated error represent physical temperature itself.
No sophomore in physics, chemistry, or engineering would make such an ignorant mistake.
But Ph.D. climate modelers have invariably done so. One climate modeler in the audience did so verbally, during Q&A after my seminar on this analysis.
The worst of it is that both the manuscript and the supporting information document explained that error bars represent an ignorance width. Not one of these Ph.D. reviewers gave any evidence of having read any of it.
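For completeness, here is a minimal sketch (again my own, with an illustrative per-step uncertainty) of what a growing ± envelope actually is: a statement of ignorance drawn around a single, smooth projection that never itself jumps anywhere.

```python
import math

trend = 0.02   # C/year, the model's projected warming rate (illustrative)
u_step = 1.4   # C, assumed per-step uncertainty (illustrative)

for year in (1, 10, 50, 100):
    projection = trend * year              # the model's one smooth trajectory
    ignorance = math.sqrt(year) * u_step   # the width of what we can know about it
    print(f"year {year:3d}: projection {projection:5.2f} C, uncertainty +/-{ignorance:4.1f} C")
# The projection never visits the envelope extremes. The envelope only says that nothing
# inside it can be physically distinguished from the projection.
```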
5. Unique Result – a concept unknown among climate modelers.
Do climate modelers understand the meaning and importance of a unique result?
“[L]ooking the last glacial maximum, the same models produce global mean changes of between 4 and 6 degrees colder than the pre-industrial. If the conclusions of this paper were correct, this spread (being so much smaller than the estimated errors of +/- 15 deg C) would be nothing short of miraculous.”
“In reality climate models have been tested on multicentennial time scales against paleoclimate data (see the most recent PMIP intercomparisons) and do reasonably well at simulating small Holocene climate variations, and even glacial-interglacial transitions. This is completely incompatible with the claimed results.”
“The most obvious indication that the error framework and the emulation framework presented in this manuscript is wrong is that the different GCMs with well-known different cloudiness biases (IPCC) produce quite similar results, albeit a spread in the …”
Let’s look at where these reviewers get such confidence. Here’s an example from Rowlands (2012) of what models produce.
Original Legend: “Figure 1 | Evolution of uncertainties in reconstructed global-mean temperature projections under SRES A1B in the HadCM3L ensemble.” 
The variable black line in the middle of the group represents the observed air temperature. I added the horizontal black lines at 1 K and 3 K, and the vertical red line at year 2055. Part of the red line is in the original figure, as the precision uncertainty bar.
This Figure displays thousands of perturbed physics simulations of global air temperatures. “Perturbed physics” means that model parameters are varied across their range of physical uncertainty. Each member of the ensemble is of equivalent weight. None of them are known to be physically more correct than any of the others.
The physical energy-state of the simulated climate varies systematically across the years. The horizontal black lines show that multiple physical energy states produce the same simulated 1 K or 3 K anomaly temperature.
The vertical red line at year 2055 shows that the identical physical energy-state (the year 2055 state) produces multiple simulated air temperatures.
These wandering projections do not represent natural variability. They represent how parameter magnitudes, varied across their uncertainty ranges, affect the temperature simulations of the HadCM3L model itself.
The Figure fully demonstrates that climate models are incapable of producing a unique solution to any climate energy-state.
That means simulations close to observations are not known to accurately represent the true physical energy-state of the climate. They just happen to have opportunistically wonderful offsetting errors.
That means, in turn, the projections have no informational value. They tell us nothing about possible future air temperatures.
There is no way to know which of the simulations actually represents the correct underlying physics. Or whether any of them do. And even if one of them happens to conform to the future behavior of the climate, there’s no way to know it wasn’t a fortuitous accident.
Models with large parameter uncertainties cannot produce a unique prediction. The reviewers’ confident statements show they have no understanding of that, or of why it’s important.
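A toy version of the non-uniqueness problem (my own construction; the parameter names and numbers are purely hypothetical) shows why matching an observation identifies no unique physics: several parameter pairs with offsetting errors reproduce the same “observed” value, yet they diverge as soon as they are extrapolated.

```python
# Toy model: hindcast = sensitivity * feedback * forcing. Several (sensitivity, feedback)
# pairs with offsetting errors reproduce the same observed value at forcing = 1 ...
observed = 1.0
candidates = [(0.5, 2.0), (1.0, 1.0), (2.0, 0.5), (4.0, 0.25)]  # hypothetical parameter pairs

for sens, fb in candidates:
    hindcast = sens * fb * 1.0              # every pair matches the observation exactly
    projection = sens * (fb * 3.0) ** 1.5   # ... but they disagree once extrapolated
    print(f"sens={sens:4.2f}, feedback={fb:4.2f}: hindcast={hindcast:.2f}, projection={projection:.2f}")
# The calibration match cannot tell us which pair, if any, represents the correct physics.
```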
Now suppose Rowlands et al. tuned the parameters of the HadCM3L model so that it precisely reproduced the observed air temperature line.
Would it mean the HadCM3L had suddenly attained the ability to produce a unique solution to the climate energy-state?
Would it mean the HadCM3L was suddenly able to reproduce the correct underlying physics?
Tuned parameters merely obscure uncertainty. They hide the unreliability of the model. It is no measure of accuracy that tuned models produce similar projections, or that their projections are close to observations. Tuning parameter sets merely offsets errors and produces a false and tendentious precision.
Every single recent, Holocene, or glacial-era temperature hindcast is likewise non-unique. Not one of them validates the accuracy of a climate model. Not one of them tells us anything about any physically real global climate state. Not one single climate modeler reviewer evidenced any understanding of that basic standard of science.
Any physical scientist would (should) know this. The climate modeler reviewers uniformly do not.
6. An especially egregious example in which the petard self-hoister is unaware of the air underfoot.
Finally, I’d like to present one last example. The essay is already long, and yet another instance may be overkill.
But I finally decided it is better to risk reader fatigue than to not make a public record of what passes for analytical thinking among climate modelers. Apologies if it’s all become tedious.
This last truly demonstrates the abysmal understanding of error analysis at large in the ranks of climate modelers. Here we go:
“I will give (again) one simple example of why this whole exercise is a waste of time. Take a simple energy balance model, solar in, long wave out, single layer atmosphere, albedo and greenhouse effect. i.e. sigma Ts^4 = S (1-a) /(1 -lambda/2) where lambda is the atmospheric emissivity, a is the albedo (0.7), S the incident solar flux (340 W/m^2), sigma is the SB coefficient and Ts is the surface temperature (288K).
“The sensitivity of this model to an increase in lambda of 0.02 (which gives a 4 W/m2 forcing) is 1.19 deg C (assuming no feedbacks on lambda or a). The sensitivity of an erroneous model with an error in the albedo of 0.012 (which gives a 4 W/m^2 SW TOA flux error) to exactly the same forcing is 1.18 deg C.
“This the difference that a systematic bias makes to the sensitivity is two orders of magnitude less than the effect of the perturbation. The author’s equating of the response error to the bias error even in such a simple model is orders of magnitude wrong. It is exactly the same with his GCM emulator.”
The “difference” the reviewer is talking about is 1.19 C – 1.18 C = 0.01 C. The reviewer supposes that this 0.01 C is the entire uncertainty produced by the model due to a 4 Wm-2 offset error in either albedo or emissivity.
But it’s not.
First reviewer mistake: If 1.19 C or 1.18 C are produced by a 4 Wm-2 offset forcing error, then 1.19 C or 1.18 C are offset temperature errors. Not sensitivities. Their tiny difference, if anything, confirms the error magnitude.
Second mistake: The reviewer doesn’t know the difference between an offset error (a statistic) and temperature (a thermodynamic magnitude). The reviewer’s “sensitivity” is actually “error.”
Third mistake: The reviewer equates a 4 W/m2 energetic perturbation to a ±4 W/m2 physical error statistic.
This mistake, by the way, again shows that the reviewer doesn’t know to make a distinction between a physical magnitude and an error statistic.
Fourth mistake: The reviewer compares a single step “sensitivity” calculation to multi-step propagated error.
Fifth mistake: The reviewer is apparently unfamiliar with the generality that physical uncertainties express a bounded range of ignorance; i.e., “±” about some value. Uncertainties are never constant offsets.
Lemma to five: the reviewer apparently also does not know that the correct way to express the uncertainties is ±lambda or ±albedo.
But then, inconveniently for the reviewer, if the uncertainties are correctly expressed, the prescribed uncertainty is ±4 W/m2 in forcing. The uncertainty is then obviously an error statistic and not an energetic malapropism.
For those confused by this distinction, no energetic perturbation can be simultaneously positive and negative. Earth to modelers, over. . .
When the reviewer’s example is expressed using the correct ± statistical notation, 1.19 C and 1.18 C become ±1.19 C and ±1.18 C.
And these are uncertainties for a single step calculation. They are in the same ballpark as the single-step uncertainties presented in the manuscript.
As soon as the reviewer’s forcing uncertainty enters into a multi-step linear extrapolation, i.e., a GCM projection, the ±1.19 C and ±1.18 C uncertainties would appear in every step, and must then propagate through the steps as the root-sum-square [3, 10].
After 100 steps (a centennial projection), ±1.18 C per step propagates to ±11.8 C.
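For anyone who wants to reproduce the reviewer’s numbers and the propagation, here is a minimal sketch. It assumes the usual reading of the reviewer’s toy model (S = 340 W/m2, absorbed fraction 1 − a = 0.7, emissivity tuned so that Ts = 288 K); the exact decimals depend on how the 4 W/m2 forcing is defined, but the single-step response comes out near the reviewer’s 1.19 C, and the root-sum-square growth over 100 steps follows.

```python
import math

SIGMA = 5.670367e-8      # Stefan-Boltzmann constant, W m-2 K-4
S_ABS = 340.0 * 0.7      # absorbed solar flux, W m-2 (S = 340 W/m2, 1 - a = 0.7)

def surface_temp(lam):
    """One-layer toy EBM: sigma * Ts^4 = S(1 - a) / (1 - lambda/2)."""
    return (S_ABS / (SIGMA * (1.0 - lam / 2.0))) ** 0.25

# Tune the emissivity so the base state gives Ts = 288 K, as in the reviewer's example
lam0 = 2.0 * (1.0 - S_ABS / (SIGMA * 288.0 ** 4))
dT_step = surface_temp(lam0 + 0.02) - surface_temp(lam0)
print(f"lambda_0 = {lam0:.3f}, single-step response = {dT_step:.2f} C")   # ~1.19 C

# Treat ~1.18 C as a per-step uncertainty and propagate it as the root-sum-square
n_steps, u_step = 100, 1.18
u_total = math.sqrt(sum(u_step**2 for _ in range(n_steps)))   # = sqrt(n) * u_step
print(f"after {n_steps} annual steps: +/-{u_total:.1f} C")     # +/-11.8 C
```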
So, correctly done, the reviewer’s own analysis validates the very manuscript that the reviewer called a “waste of time.” Good job, that.
In summary, this reviewer:
- doesn’t know the meaning of physical uncertainty.
- doesn’t distinguish between model response (sensitivity) and model error. This mistake amounts to not knowing to distinguish between an energetic perturbation and a physical error statistic.
- doesn’t know how to express a physical uncertainty.
- and doesn’t know the difference between single step error and propagated error.
So, once again, climate modelers:
- neither respect nor understand the distinction between accuracy and precision.
- are entirely ignorant of propagated error.
- think the ± bars of propagated error mean the model itself is oscillating.
- have no understanding of physical error.
- have no understanding of the importance or meaning of a unique result.
No working physical scientist would fall for any one of those mistakes, much less all of them. But climate modelers do.
And this long essay does not exhaust the multitude of really basic mistakes in scientific thinking these reviewers made.
Apparently, such thinking is critically convincing to certain journal editors.
Given all this, one can understand why climate science has fallen into such a sorry state. Without the constraint of observational physics, it’s open season on finding significations wherever one likes and granting indulgence in science to the loopy academic theorizing so rife in the humanities [11].
When mere internal precision and fuzzy axiomatics rule a field, terms like consistent with, implies, might, could, possible, likely, carry definitive weight. All are freely available and attachable to pretty much whatever strikes one’s fancy. Just construct your argument to be consistent with the consensus. This is known to happen regularly in climate studies, with special mentions here, here, and here.
One detects an explanation for why political sentimentalists like Naomi Oreskes and Naomi Klein find climate alarm so homey. It is so very opportune to polemics and mindless righteousness. (What is it about people named Naomi, anyway? Are there any tough-minded skeptical Naomis out there? Post here. Let us know.)
In their rejection of accuracy and fixation on precision, climate modelers have sealed their field away from the ruthless indifference of physical evidence, thereby short-circuiting the critical judgment of science.
Climate modeling has left science. It has become a liberal art expressed in mathematics. Call it equationized loopiness.
The inescapable conclusion is that climate modelers are not scientists. They don’t think like scientists, and they are not doing science. They have no idea how to evaluate the physical validity of their own models.
They should be nowhere near important discussions or decisions concerning science-based social or civil policies.
1. Lauer, A. and K. Hamilton, Simulating Clouds with Global Climate Models: A Comparison of CMIP5 Results with CMIP3 and Satellite Data. J. Climate, 2013. 26(11): p. 3823-3845.
2. Meinshausen, M., et al., The RCP greenhouse gas concentrations and their extensions from 1765 to 2300. Climatic Change, 2011. 109(1-2): p. 213-241.
Note: The PWM coefficients for the CCSM4 emulations were: RCP 6.0, fCO₂ = 0.644, a = 22.76 C; RCP 8.5, fCO₂ = 0.651, a = 23.10 C.
3. JCGM, Evaluation of measurement data — Guide to the expression of uncertainty in measurement. 100:2008, Bureau International des Poids et Mesures: Sevres, France.
4. Roy, C.J. and W.L. Oberkampf, A comprehensive framework for verification, validation, and uncertainty quantification in scientific computing. Comput. Methods Appl. Mech. Engineer., 2011. 200(25-28): p. 2131-2144.
5. Rogelj, J., et al., Probabilistic cost estimates for climate change mitigation. Nature, 2013. 493(7430): p. 79-83.
6. Murphy, J.M., et al., A methodology for probabilistic predictions of regional climate change from perturbed physics ensembles. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 2007. 365(1857): p. 1993-2028.
7. Rowlands, D.J., et al., Broad range of 2050 warming from an observationally constrained large climate model ensemble. Nature Geosci, 2012. 5(4): p. 256-260.
8. Stainforth, D.A., et al., Uncertainty in predictions of the climate response to rising levels of greenhouse gases. Nature, 2005. 433(7024): p. 403-406.
9. Collins, M., et al., Quantifying future climate change. Nature Clim. Change, 2012. 2(6): p. 403-409.
10. Bevington, P.R. and D.K. Robinson, Data Reduction and Error Analysis for the Physical Sciences. 3rd ed. 2003, Boston: McGraw-Hill. 320.
11. Gross, P.R. and N. Levitt, Higher Superstition: The Academic Left and its Quarrels with Science. 1994, Baltimore, MD: Johns Hopkins University. May be the most intellectually enjoyable book, ever.