Commentary from Nature Climate Change, by John C. Fyfe, Nathan P. Gillett, & Francis W. Zwiers
Recent observed global warming is significantly less than that simulated by climate models. This difference might be explained by some combination of errors in external forcing, model response and internal climate variability.
Global mean surface temperature over the past 20 years (1993–2012) rose at a rate of 0.14 ± 0.06 °C per decade (95% confidence interval)1. This rate of warming is significantly slower than that simulated by the climate models participating in Phase 5 of the Coupled Model Intercomparison Project (CMIP5). To illustrate this, we considered trends in global mean surface temperature computed from 117 simulations of the climate by 37 CMIP5
models (see Supplementary Information).
These models generally simulate natural variability — including that associated
with the El Niño–Southern Oscillation and explosive volcanic eruptions — as
well as estimate the combined response of climate to changes in greenhouse gas
concentrations, aerosol abundance (of sulphate, black carbon and organic carbon,
for example), ozone concentrations (tropospheric and stratospheric), land
use (for example, deforestation) and solar variability. By averaging simulated
temperatures only at locations where corresponding observations exist, we find
an average simulated rise in global mean surface temperature of 0.30 ± 0.02 °C
per decade (using 95% confidence intervals on the model average). The
observed rate of warming given above is less than half of this simulated rate, and
only a few simulations provide warming trends within the range of observational
uncertainty (Fig. 1a).
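As a rough illustration of how such decadal trends and their confidence intervals can be computed, here is a minimal sketch in Python. The function name `decadal_trend_ci` and the synthetic series are mine, not from the paper; this ordinary-least-squares version also ignores autocorrelation, which a published trend analysis would need to account for.

```python
import numpy as np

def decadal_trend_ci(years, temps, z=1.96):
    """Least-squares trend in deg C per decade, with an approximate 95%
    confidence half-width from the standard error of the slope."""
    years = np.asarray(years, dtype=float)
    temps = np.asarray(temps, dtype=float)
    n = len(years)
    x = years - years.mean()                 # center the predictor
    slope = np.sum(x * (temps - temps.mean())) / np.sum(x * x)
    resid = temps - temps.mean() - slope * x
    se = np.sqrt(np.sum(resid**2) / (n - 2) / np.sum(x * x))
    return 10.0 * slope, 10.0 * z * se       # scale per-year to per-decade

# Synthetic 1993-2012 series warming at exactly 0.2 deg C per decade.
years = np.arange(1993, 2013)
temps = 0.02 * (years - 1993)
trend, half_width = decadal_trend_ci(years, temps)
```

With a noise-free input the recovered trend is exact and the confidence half-width collapses to zero; on real data the residuals set the width of the interval.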
Figure 1 | Trends in global mean surface temperature. a, 1993–2012. b, 1998–2012. Histograms of observed trends (red hatching) are from 100 reconstructions of the HadCRUT4 dataset1. Histograms of model trends (grey bars) are based on 117 simulations of the models, and black curves are smoothed versions of the model trends. The ranges of observed trends reflect observational uncertainty, whereas the ranges of model trends reflect forcing uncertainty, as well as differences in individual model responses to external forcings and uncertainty arising from internal climate variability.
The inconsistency between observed and simulated global warming is even more
striking for temperature trends computed over the past fifteen years (1998–2012).
For this period, the observed trend of 0.05 ± 0.08 °C per decade is less than
one quarter of the average simulated trend of 0.21 ± 0.03 °C per decade (Fig. 1b).
It is worth noting that the observed trend over this period — not significantly
different from zero — suggests a temporary ‘hiatus’ in global warming. The
divergence between observed and CMIP5-simulated global warming begins in the
early 1990s, as can be seen when comparing observed and simulated running trends
from 1970–2012 (Fig. 2a and 2b for 20-year and 15-year running trends, respectively).
The evidence, therefore, indicates that the current generation of climate models
(when run as a group, with the CMIP5 prescribed forcings) does not reproduce
the observed global warming over the past 20 years, or the slowdown in global
warming over the past fifteen years.
This interpretation is supported by statistical tests of the null hypothesis that the
observed and model mean trends are equal, assuming that either: (1) the models are
exchangeable with each other (that is, the ‘truth plus error’ view); or (2) the models
are exchangeable with each other and with the observations (see Supplementary
Information).
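The second exchangeability assumption can be illustrated with a simple rank test: if the observed trend really were exchangeable with the model trends, its rank in the pooled sample would be uniformly distributed. The sketch below is hypothetical (the function name and the illustrative spread of model trends are mine, not the paper's actual test).

```python
import numpy as np

def exchangeability_p_value(obs_trend, model_trends):
    """Two-sided rank-based p-value under the null hypothesis that the
    observation is exchangeable with the model trends, in which case its
    rank among the pooled values is uniform on 1..n."""
    pooled = np.append(np.asarray(model_trends, dtype=float), obs_trend)
    n = len(pooled)
    rank = int(np.sum(pooled <= obs_trend))  # rank of the observation, 1..n
    return min(1.0, 2.0 * min(rank / n, (n - rank + 1) / n))

# 117 illustrative model trends spread around 0.30 deg C per decade,
# versus an observed trend of 0.14 that falls below all of them.
model_trends = np.linspace(0.18, 0.42, 117)
p = exchangeability_p_value(0.14, model_trends)
```

An observation colder than every one of n models yields p = 2/(n+1), so with 117 models the null is rejected at the 5% level.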
Brief: http://www.pacificclimate.org/sites/default/files/publications/pcic_science_brief_FGZ.pdf
Paper at NCC: http://www.nature.com/nclimate/journal/v3/n9/full/nclimate1972.html?WT.ec_id=NCLIMATE-201309
- Supplementary Information (241 KB) CMIP5 Models

M Courtney says:
September 5, 2013 at 12:34 pm
You are correct. Please pardon me.
Theo Goodwin. No Worries, Sir.
But I really do agree with you when you imply that my father should focus more on the real issues rather than smashing everyone who is wrong on the internet.
He would get to bed earlier.
Richardscourtney: Thanks for the link. I’m sure this is obvious to you, but it’s a PDF file and needs Adobe Acrobat Reader to open. Other than that, I can’t understand why you can’t access the link, as it works fine on all my computers except tablets.
Theo Goodwin says:
September 5, 2013 at 12:36 pm
At least in the US, “nitpicking” has been in the language since the 1950s, if not before.
BBould:
Repeated attempts to download the file have each locked up my computer, so I have had to restart it.
As you suspected, I do have Adobe Acrobat, and it loads before the problem arises. I notice from the header that the paper is 42 pages, so I am wondering if the problem has something to do with the large file size.
If you cannot provide a reference for me so I can try to access it elsewhere, perhaps – as a start – you can copy its abstract to here so I can at least understand what you are asking about?
Sorry to be a nuisance about this.
Richard
I’ve said this on other threads, but it is especially true on this one. CMIP5 is an aggregation of independent models. There is no possible null hypothesis for such an aggregate, nor is the implied use of ordinary statistics in the analysis above axiomatically supportable.
GCMs are not independent, identically distributed samples.
Consequently, the central limit theorem has absolutely no defensible application to the mean or variance obtained for a single projective parameter extracted from an ensemble of GCMs. This is equally true in both directions. One cannot reject “CMIP5” per se, or any of the participating models, on the grounds of a hypothesis test based on an assumed normal distribution and the error function used to obtain a p-value; nor can one assert that the mean of this projective distribution and its variance enable any statement about how “likely” any given temperature is in the underlying distribution.
What one can do is take each model in the collection, where each model typically produces a spread of outcomes for a Monte Carlo random perturbation of the initial conditions and parameters, analyze the mean of that spread and its statistical moments and properties, and compare the result to observation. In this case the Monte Carlo perturbation is indeed a selection of random iid samples from a known distribution, so the central limit theorem applies. Given a Monte Carlo generated statistical envelope of model results, one doesn’t really need it: one can assess the p-value directly by comparing the actual trajectory to the ensemble of model-generated trajectories, even if the latter is not Gaussian.
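That last step, assessing a p-value directly from a Monte Carlo envelope without assuming a Gaussian, can be sketched as follows. The ensemble here is synthetic and the function name is mine; it is a minimal illustration, not anyone's actual analysis.

```python
import numpy as np

def empirical_p_value(observed_stat, ensemble_stats):
    """Fraction of Monte Carlo ensemble members at least as far from the
    ensemble median as the observation (a two-sided empirical p-value),
    with the +1 correction so p is never exactly zero."""
    stats = np.asarray(ensemble_stats, dtype=float)
    med = np.median(stats)
    extreme = int(np.sum(np.abs(stats - med) >= abs(observed_stat - med)))
    return (extreme + 1) / (len(stats) + 1)

# One hypothetical model: 999 perturbed-initial-condition runs whose
# 15-year trends cluster near 0.25 deg C per decade.
rng = np.random.default_rng(1)
runs = rng.normal(0.25, 0.04, size=999)
p = empirical_p_value(0.05, runs)  # observed trend far from the envelope
```

No normality assumption enters: the p-value comes straight from the rank of the observation within the ensemble, which is exactly the comparison-to-the-envelope described above.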
It is this last step that is never done. Based on the spaghetti snarl of GCM results that I’ve seen in e.g. AR4 or AR5 for specific models, compared to the actual temperature most of them would individually fail a correctly implemented hypothesis test, where now there is a meaningful null hypothesis and p-value for each model separately (the hypothesis being that the model itself is correct, with the contingent probability of producing the observed result). Indeed, a lot of them would fail with a very high degree of confidence (very small p-values, well below an e.g. 0.05 cutoff).
If those models were removed from CMIP5 then, at a guess (since I do not have access to the actual distribution of trajectories for all of the contributing models and have to generalize from the samples I’ve seen), one would give up pretty much all of the models to the right of the primary peak, throwing them into the garbage can (especially the secondary and tertiary peaks, as the distribution isn’t even cleanly unimodal in figure b). I’m guessing that while one could not actively fail all of the ones in between that cutoff and reality, a lot of the remaining ones would have systematically poor p-values: low enough to ensure that they aren’t all right for sure, without necessarily being able to tag specific ones as wrong.
Even this analysis is faulty, because executing the strategy above is a form of data dredging, and the criterion for passing the hypothesis test has to be accordingly strengthened because you have so many opportunities to pass the hypothesis test that it isn’t surprising that some models might seem to do so even though they truly fail as a statistical accident. This would knock off a few more. I’m guessing that by the time one was done with this process, one would have a much, much smaller set of models that survived the “I guess I need to reasonably agree with empirical observation, don’t I?” cut, and the model mean would be much, much closer to reality (and still irrelevant).
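The data-dredging point amounts to a multiple-comparisons problem, and the standard remedy is a family-wise correction such as the Holm step-down procedure. A generic sketch (the per-model p-values below are made up for illustration):

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: decide which hypotheses can be rejected
    while controlling the family-wise error rate at alpha. Tests the
    smallest p-value against alpha/m, the next against alpha/(m-1), and
    stops at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for k, i in enumerate(order):
        if p_values[i] <= alpha / (m - k):
            reject[i] = True
        else:
            break  # all larger p-values also fail
    return reject

# Hypothetical per-model p-values from individual hypothesis tests.
p_vals = [0.001, 0.004, 0.03, 0.20, 0.45]
flags = holm_reject(p_vals)
```

Note that 0.03 would pass a naive 0.05 cutoff on its own but survives the corrected test, which is exactly the "strengthened criterion" effect described above, seen from the rejection side.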
At that point, though, one could examine the moderately successful (or at least, “close”) models to see what features they have in common and to help craft the next generation of models. It would also be an excellent time to re-assess the input variables (thinking about omitted variables in particular) and consider backwards (Bayesian) inference of parameters like the sensitivity, finding a value of the sensitivity that (for example) makes the centroid of the Monte Carlo distribution agree with the observed record. At least, it would be an excellent time to do this sort of thing if the GCM owners were doing science instead of performing political theater. And some of them are! Climate sensitivity is in free-fall for precisely that reason, because the clearly evident warming bias in almost all the models has driven a lot of honest (and formerly honestly mistaken) researchers back to the drawing board to determine the highest value the sensitivity could reasonably be expected to take without making the p-value TOO small, a form of Bayesian analysis that perhaps overweights the former prior. This IMO will only mean that they have to move it again in the future (sigh) but that is their call. That’s what the future is for, anyway: to validate the better and falsify the worse.
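The "backwards (Bayesian) inference" of sensitivity could be sketched as a grid-based posterior update. Everything here is a toy assumption: the linear mapping from sensitivity to predicted trend, the Gaussian observational error, and the flat prior are all mine, chosen only to show the mechanics.

```python
import numpy as np

def posterior_over_sensitivity(obs_trend, trend_sd, sens_grid,
                               trend_of_sens, prior=None):
    """Grid-based Bayesian update: posterior over climate sensitivity
    given an observed trend, assuming Gaussian observational error and a
    (hypothetical) mapping from sensitivity to predicted trend."""
    sens_grid = np.asarray(sens_grid, dtype=float)
    predicted = trend_of_sens(sens_grid)
    like = np.exp(-0.5 * ((obs_trend - predicted) / trend_sd) ** 2)
    prior = np.ones_like(sens_grid) if prior is None else np.asarray(prior, float)
    post = like * prior
    return post / post.sum()  # normalize on the grid

# Toy mapping: predicted decadal trend proportional to sensitivity.
grid = np.linspace(1.0, 6.0, 501)
post = posterior_over_sensitivity(0.14, 0.03, grid, lambda s: 0.07 * s)
map_sens = grid[np.argmax(post)]  # sensitivity that best matches 0.14
```

The posterior mode is just the sensitivity whose predicted trend matches the observed record, which is the "make the centroid agree with observation" idea in miniature; a real analysis would use an actual forcing-response relation, not a linear toy.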
rgb
kadaka (KD Knoebel) says:
September 5, 2013 at 5:49 am (replying to )
gnomish said on September 5, 2013 at 2:59 am:
Referencing the problems in the experimental setup above (melting bowl, little effect, difficulty in setup and measurements).
Try it again, but with a much larger “area” of the water exposed to the heat relative to the total volume of water. That is, use a steel baking pan much larger than the IR heater area. That way, the IR heats the water under the heater, but the sides of the aluminum or steel pan are far away from the edges of the IR heater. If the water is as high as possible in the pan, then the pan edges will be both further away (increasing r^2 losses) and have less area exposed to the IR radiation coming from the center of the IR heater.
I would argue with your proposal. Radiation enters water but physical heat does not.
M Courtney says:
September 5, 2013 at 11:53 am
“- The literal meaning (we all know what words mean)”
Not so. Explaining, I think, the recent gaffe of Tony Abbott –
“No one — however smart, however well-educated, however experienced — is the suppository of all wisdom.” Mr Abbott appeared to mix up the word “suppository” with the word “repository”.
http://www.independent.co.uk/news/world/australasia/australian-election-tony-abbott-hits-bum-note-with-suppository-of-all-wisdom-gaffe-8757527.html
Now I return you to your regular programming.
richardscourtney –
I’ve been thinking for a long time now that those of us who are not satisfied with the science on climate change need to do some simple, strategic “marketing” to better represent ourselves. For example, we do not deny that the climate changes; we’re the ones who usually point that out. We do not deny that it warms when it actually warms, nor do we deny that it cools when it cools, etc. But we’ve allowed the “other side” to define us for so long that even the general public accepts their definitions of us. It has to stop. But we have to have other things to fill that void: definitions and statements that reflect the truth. Things that are simple and solid and consistent and completely irrefutable that we just keep saying and saying and saying and saying until their definitions of us lose all traction.
With that said, I LOVED something you said earlier:
“The modellers built their climate models to represent their understandings of climate mechanisms. If their understandings were correct then the models would behave as the climate does. The fact that the climate models provide indications which are NOT “consistent with measurements” indicates that the understanding of climate mechanisms of the modellers is wrong (or, at least, the way they have modeled that understanding is in error).”
I wanted to ask you if I could use that statement, over and over and over again? (I’ll credit you every time if you wish.) If we, as those who are “skeptical about what is being called climate science”, just state something like this over and over again, we drive home several truths all at once. If we could cause people in general to start asking themselves… and then others…:
“Is this study done with models?” or
“Why are scientists using models they know are wrong?” or
“Do the scientists even know their models are wrong?” or
“So how much of what we’re being told is based on EVIDENCE and FACT, and how much is based on flawed models?”
…we’d be creating a whole world full of skeptics and critical thinkers. Imagine…..:)
==========================================================================
Well, I can think of a few for whom “suppository” would be the word to describe the source of some of their bits of “wisdom”.
Aphan:
re your question to me at September 5, 2013 at 2:04 pm.
Yes, of course you can use it if you want to. I would not have written it if I did not want people to read it.
And feel free to adopt it as your own if you want. I take great pleasure when I see phrases I invented or first applied long ago. And the pleasure is greatest when people tell me it is something I should hear (I smile inside and say nothing).
Richard
richardscourtney-
“And the pleasure is greatest when people tell me it is something I should hear (I smile inside and say nothing).”
LOL! You scamp you! I think that at least once, you should respond along the lines of “Why…that is a truly brilliant point!” If you do it within earshot of your son, you can tell him I put you up to it. 🙂
Richardscourtney: Here is the abstract.
This paper examines the standard Bayesian solution to the Quine-Duhem
problem, the problem of distributing blame between a theory and its auxiliary
hypotheses in the aftermath of a failed prediction. The standard
solution, I argue, begs the question against those who claim that the problem
has no solution. I then provide an alternative Bayesian solution that
is not question-begging and that turns out to have some interesting and
desirable properties not possessed by the standard solution. This solution
opens the way to a satisfying treatment of a problem concerning ad hoc
auxiliary hypotheses.
BBould:
In light of my difficulty downloading the file to comment on a paper as you requested, I write to say I now think you probably have an answer to the question you are really asking.
I think that answer is probably provided by considering the information in my post at September 5, 2013 at 8:09 am
http://wattsupwiththat.com/2013/09/05/statistical-proof-of-the-pause-overestimated-global-warming-over-the-past-20-years/#comment-1408595
together with the information in the short post from davidmhoffer at September 5, 2013 at 8:34 am
http://wattsupwiththat.com/2013/09/05/statistical-proof-of-the-pause-overestimated-global-warming-over-the-past-20-years/#comment-1408620
and the information in the long post from rgbatduke at September 5, 2013 at 1:46 pm
http://wattsupwiththat.com/2013/09/05/statistical-proof-of-the-pause-overestimated-global-warming-over-the-past-20-years/#comment-1408910
I especially commend study of the long post from Prof Brown, aka rgbatduke (see, I told you his comments are good).
Richard
Gail Combs says: September 5, 2013 at 7:14 am
Thanks for the reminder about the Horse Latitudes, hadn’t heard of them in, I guess, decades.
BBould:
Thank you for your post addressed to me at September 5, 2013 at 2:39 pm. It came in while I was writing my post to you at September 5, 2013 at 2:39 pm.
OK. I see why you want me to read the paper: its abstract claims
Well, “those who claim that the problem has no solution” certainly includes me and I think includes Prof Brown, so I really do need to find a way to get at that paper.
Richard
BBould says:
September 5, 2013 at 2:39 pm
Richardscourtney: Here is the abstract.
“This paper examines the standard Bayesian solution to the Quine-Duhem
problem, the problem of distributing blame between a theory and its auxiliary
hypotheses in the aftermath of a failed prediction.”
Quine created the Duhem-Quine thesis but would have had no patience for Bayesians. The Duhem-Quine thesis does not reference auxiliary hypotheses. It applies to all the hypotheses in the theory and to the evidence for the theory.
“The standard
solution, I argue, begs the question against those who claim that the problem
has no solution. I then provide an alternative Bayesian solution that
is not question-begging and that turns out to have some interesting and
desirable properties not possessed by the standard solution. This solution
opens the way to a satisfying treatment of a problem concerning ad hoc
auxiliary hypotheses.”
This might be good work but it is not part-and-parcel of Quine’s work. Seems to me that it just takes Quine’s logic and adds to it. To those who pursued probabilities for various hypotheses, he remarked that he had no interest in colored marbles in an urn.
milodonharlani says:
September 5, 2013 at 1:26 pm
Do you have first hand evidence? I do not trust dictionaries on the matter of street language.
If you are right, I still claim that the nit was a creation of the Sixties. Do you remember nits?
rgbatduke says:
September 5, 2013 at 1:46 pm
Thanks so much for this very important work. Some among mainstream climate scientists who are usually willing to take skeptics seriously do not understand what you have just explained.
M Courtney says:
September 5, 2013 at 1:18 pm
When Lincoln referred to “the better angels of our nature” he was thinking of the first hour after a long, good night’s sleep.
Nitpick
http://www.etymonline.com/index.php?allowed_in_frame=0&search=nitpicker&searchmode=none
rgbatduke says:
September 5, 2013 at 1:46 pm
OK. So check me here, and see if I have interpreted what you wrote correctly.
You CANNOT add all of the GCM outputs together and “average” them for each year, because they are not independently measured properties subject to statistical theory. That is, if they really were accurate computer models of an accurately modeled physical process, EVERY run of the same model parameters would be identical: a calculator does NOT add 2+2+2 = 6 differently every time. Further, this future temperature vs CO2 growth is NOT a biological statistical value like tree height and trunk diameter: you CANNOT measure a lot of them and get a “more accurate” diameter or average height, because these are not models based on “average growth rates per ton of fertilizer or per thousand gallons of water”, right?
However, since model inputs DO vary statistically because of their Monte Carlo internal calculators, it IS a valid comparison to run each model separately several thousand times, then compare THAT “average” model output to itself to see if it is putting out random or valid predictions. (Ignore Oldberg for the rest of this, OK?) At this point, one should compare the 24-odd average model runs against real world (no volcanos since 1993, known aerosol changes, known ENSO and PDO changes, known Arctic and Antarctic polar ice cover changes) and see which are most accurate.
If any are not-as-bad-as-the-rest-but-not-right (within 2 standard deviations at least), we should throw out the worst 20 models, modify the remaining 4, and continue to re-run them until they duplicate the past 50 years accurately. Then wait and see which of the corrected 4 is best. In the meantime, fix the bad 20 that were originally trashed.
Correct?
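The per-model screening described in this comment can be sketched as follows. The model names, the synthetic ensembles, and the 2-standard-deviation keep/reject criterion are all illustrative; this is not an actual CMIP5 analysis.

```python
import numpy as np

def screen_models(obs, model_runs):
    """For each model (a dict entry mapping name to that model's own
    Monte Carlo runs), keep it only if the observation lies within two
    ensemble standard deviations of the ensemble mean."""
    keep = []
    for name, runs in model_runs.items():
        runs = np.asarray(runs, dtype=float)
        if abs(obs - runs.mean()) <= 2.0 * runs.std(ddof=1):
            keep.append(name)
    return keep

# Hypothetical per-model trend ensembles (deg C per decade).
rng = np.random.default_rng(2)
ensembles = {
    "model_a": rng.normal(0.10, 0.05, size=200),  # runs near the observed trend
    "model_b": rng.normal(0.35, 0.03, size=200),  # runs persistently warm
}
kept = screen_models(0.05, ensembles)
```

The key design point, matching the comment above, is that each model is compared to observation against its own Monte Carlo spread, never against a pooled multi-model distribution.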
John F. Hultquist says: September 5, 2013 at 9:08 am
….This is one of at least 3 explanations for the term “Horse Latitudes”
>>>>>>>>>>>>>>>>>>>
I always figured they would eat the beasts, not throw them overboard….
Isn’t the rewriting of history great? (Just don’t tell all those kids sweating through their history finals.)
Theo Goodwin says: “Very well said. The “radiation-only theory,” used by all Alarmists is purely deterministic….”
And yet these same Alarmists claim there is such a thing as thermal inertia in a system driven by instantaneous radiative transfer.
– – – – – – – –
rgbatduke,
Your comment was helpful. Thanks.
Considering your whole comment, what if a large voluntary group of independent (of government) academic institutions decided to start an evaluation of climate, and decided to do it without relying on the current body of IPCC bureaucracies, processes and reports? Let’s say their mission would be the integration of a very widely balanced sample of climate research into a consistent, comprehensive overview; a mission without a mandate to look for evidence of any particular climate factor (for example, anthropogenic factors from burning fossil fuels). Let’s say the product of the consortium is for itself but is freely available to anyone or any government. Government isn’t the target of the product.
Question for RGB => In that scenario would you expect models would have the significance given to them by the IPCC? What reasonable role for models would you suggest in the scenario?
John
Richardscourtney: If you do a Google search of the title, more copies pop up and you should be able to choose one that works. Please post about this paper if you can.
Much thanks.