Date: 2019-09-21
Guest post by Kevin Kilty
Introduction
This short essay was prompted by a recent article regarding improvements to uncertainty in a global mean temperature estimate.[1] However, much bandwidth has been spilt lately on the related topic of error propagation [2, 3, 4], and so a small portion of this essay, in its concluding remarks, is devoted to it as well.
Manufacturing engineers work to improve product design, make products easier to manufacture, lower costs, and maintain or improve product quality. Among the tools they use to accomplish this, many are statistical in nature, and these have pertinence to the topic of the surface temperature record and its interpretation in the light of climate model projections. One tool I plan to present here is statistical process control (SPC).[5]
1. Ever Present Variation
Manufactured items cannot be made identically. Even in mass production under the control of machines, there are influences such as wear of the machine, variations in settings, skill of operators, incoming material property variations and so forth, which lead to variation in a final product. All precision manufacturing begins with an examination of two things. First, there is the customer specification. This includes all the important product parameters and the limits that these parameters must stay within. Functionality of a product suffers if these quality measures do not stay within limits. Second is the process capability. Any manufacturer worth the title will know how the process used to make products for a customer varies when it is in control. This leads the manufacturer to an estimate of how many products in a run will be outside tolerance, how many might be reworked and so forth. It is not possible to estimate costs and profits without knowing capability.
2. Process Capability and Control
If a manufacturer’s process can produce routinely within the specifications, perhaps only one in a hundred items, or one in a thousand or three in a million (six sigma) outside of it, whatever is cost effective and achievable, then the process is capable. If it proves not capable one might ask what cost in new machinery would make it capable, and if the answer is not cost effective one might pass on the manufacturing opportunity or have someone more capable handle it. When a process is in control, it is operating as well as is humanly possible considering one’s capability. A process in control is an important concept to our discussion.
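The capability idea can be put in numbers with a short sketch. It assumes the measured attribute is normally distributed, which the essay does not state, and the function names and specification limits are hypothetical. The second example assumes the usual six sigma convention of a 1.5 sigma drift in the process mean, which is where the figure of roughly three per million comes from.

```python
import math

def upper_tail(z: float) -> float:
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def fraction_out_of_spec(mean: float, sigma: float, lsl: float, usl: float) -> float:
    """Expected fraction of items outside [lsl, usl], assuming a normal process."""
    below = upper_tail((mean - lsl) / sigma)   # items under the lower spec limit
    above = upper_tail((usl - mean) / sigma)   # items over the upper spec limit
    return below + above

def cpk(mean: float, sigma: float, lsl: float, usl: float) -> float:
    """Capability index: distance to the nearer spec limit in units of 3 sigma."""
    return min(usl - mean, mean - lsl) / (3.0 * sigma)

# Centered process with spec limits at +/- 3 sigma: about 2.7 items per thousand.
print(fraction_out_of_spec(0.0, 1.0, -3.0, 3.0))

# Spec limits at +/- 6 sigma, mean drifted by 1.5 sigma: about 3.4 per million.
print(fraction_out_of_spec(1.5, 1.0, -6.0, 6.0))
```

Whether a given fraction counts as capable is the cost judgment described above; the calculation only supplies the number.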
3. Statistical Process Control
Statistical process control (SPC) is mainly a process of charting and interpreting measurements in real time. Various SPC charts become a tool through which an operator, potentially someone of modest training, can monitor a process and adjust it or stop it if indications are that it is drifting out of control. There are many different possible control charts, but a common one is the X-bar chart, so named because the parameter being monitored and recorded on the chart is the mean attribute of a sample of manufactured items. Often it is paired with an R chart, which shows the range within the same measurements. R is often used in manufacturing because it is capable of showing the same information about variation as, say, standard deviation, but with much less calculation. Let’s discuss the X-bar chart. Figure 1 shows an example of a paired set of charts.[5]
Figure 1. A pair of control charts for X-bar and range. The X-bar chart shows measurements exceeding control limits above and below, while the range shows no increase in variability. We conclude an operator is unnecessarily changing machine settings. Source [5].
Use of the chart begins with its construction. First, there is a specified target value for the process. A process is then designed to achieve this target. Then some number of measurements are taken from this process while it is known to be operating as well as is humanly possible – i.e. in control. Measurements are gathered into consecutive groups of fixed number, N (five and seven are common), and the mean and range of each group are calculated, along with the mean of the group means and the mean of the group ranges. Dead center horizontally across the chart is the target value; horizontal lines are then placed above and below it at some multiple of the process variation, measured by range or standard deviation. These are known as the process control limits (upper and lower control limits, UCL and LCL respectively).
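The construction can be sketched in a few lines. This is a minimal illustration, not the handbook's procedure verbatim: it uses the standard tabulated Shewhart constants A2, D3 and D4 for subgroups of five and seven, and it centers the X-bar chart on the grand mean of the baseline data, which coincides with the target only when the process is on target. The function name and the baseline numbers are hypothetical.

```python
from statistics import mean

# Standard Shewhart chart constants, indexed by subgroup size N.
CHART_CONSTANTS = {
    5: {"A2": 0.577, "D3": 0.0, "D4": 2.114},
    7: {"A2": 0.419, "D3": 0.076, "D4": 1.924},
}

def control_limits(subgroups):
    """Return (LCL, center, UCL) for the X-bar and R charts from baseline data."""
    n = len(subgroups[0])
    if any(len(group) != n for group in subgroups):
        raise ValueError("all subgroups must have the same size")
    if n not in CHART_CONSTANTS:
        raise ValueError("no chart constants tabulated here for this subgroup size")
    c = CHART_CONSTANTS[n]
    grand_mean = mean(mean(group) for group in subgroups)             # mean of the means
    mean_range = mean(max(group) - min(group) for group in subgroups)  # mean of the ranges
    return {
        "xbar": (grand_mean - c["A2"] * mean_range,
                 grand_mean,
                 grand_mean + c["A2"] * mean_range),
        "range": (c["D3"] * mean_range, mean_range, c["D4"] * mean_range),
    }

# Hypothetical in-control baseline: two subgroups of five measurements each.
baseline = [[9.0, 10.0, 10.0, 10.0, 11.0], [9.5, 10.0, 10.0, 10.0, 10.5]]
print(control_limits(baseline))
```

In practice the baseline would contain many more subgroups than the two shown; two are used here only to keep the arithmetic easy to follow.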
At this point one uses the chart to monitor an ongoing process. Think of charting as recording a continuing sequence of experiments. On a schedule our fixed number of manufactured items (N) are removed from production. The mean and range of some important attribute are calculated for this sample and the results plotted on their respective charts. The null hypothesis in each experiment is that the process continues to run just as it did during the chart creation period. As work proceeds the sequence of measured and plotted samples shows either a pattern that is expected of a process in control, or a pattern of unexpected variations which suggests a process with problems. Observation by an operator of an unlikely pattern, such as cycles, drift across the chart, too many points plotting outside control limits, or hugging one side of the chart, is evidence of a process out of control. An out of control process can be stopped temporarily while the process engineer or maintenance staff find and rectify the problem. One thing worth emphasizing is that SPC is a highly successful tool for handling variation in processes and identifying problems.
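Two of the unlikely patterns listed above are easy to check mechanically. The sketch below flags a point beyond the three sigma control limits and a run of eight consecutive points on one side of the center line, both of which are among the commonly cited Western Electric rules. The function name and the sample data are hypothetical, and `sigma` here is assumed to be the standard deviation of the plotted statistic (the subgroup mean), not of individual items.

```python
def out_of_control_signals(points, center, sigma, run_length=8):
    """Return (index, reason) pairs for two common out-of-control patterns."""
    signals = []
    run_side = 0    # +1 while points sit above center, -1 below, 0 when on it
    run_count = 0   # length of the current same-side run
    for i, x in enumerate(points):
        # Pattern 1: a single point outside the 3-sigma control limits.
        if abs(x - center) > 3.0 * sigma:
            signals.append((i, "beyond 3-sigma control limit"))
        # Pattern 2: a long run hugging one side of the center line.
        side = (x > center) - (x < center)
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side = side
            run_count = 1 if side != 0 else 0
        if run_count >= run_length:
            signals.append((i, f"{run_count} consecutive points on one side of center"))
    return signals

# Hypothetical plotted subgroup means, in units of sigma about a center of zero.
sample = [0.5, -0.5, 3.5] + [-0.3] * 8
for index, reason in out_of_control_signals(sample, center=0.0, sigma=1.0):
    print(index, reason)
```

Cycles and slow drift need other tests, such as counting runs of steadily rising or falling points; they are left out to keep the sketch short.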
Figure 2. “…Comparison of a large set of climate model runs (CMIP5) with several observational temperature estimates. The thick black line is the mean of all model runs. The grey region is its model spread. The dotted lines show the model mean and spread with new estimates of the climate forcings. The coloured lines are 5 different estimates of the global mean annual temperature from weather stations and sea surface temperature observations….” Figures and description: Gavin Schmidt[6].
4. Ensemble of Models
Let’s turn attention to the subject of climate. The oft-cited ensemble of model projections is something like a control chart. It represents a spread of model projections carefully initiated to represent what we believe is a future path of mean earth temperature with credible additions of CO2. It is not a plot of the full variation that climate models might conceivably produce, but rather the more controlled variation of our expectations given what we know of climate and the differential equations representing it when it is in control. It is this in-control concept that makes the process control chart and the projection ensemble similar to one another. The resemblance is even more complete with an overlay of observed temperature.
Figure 3. The grey 95% bounds of Figure 2 redrawn in skewed coordinates (blue/orange) to look more like a control chart. The grey lines indicate the envelope of observations. Black line is target.
This ensemble became controversial once people began placing observed temperatures on it. Schmidt produced one in a blog post in 2015.[6] Figure 2 shows it. What the comparison between observed and projected temperature showed, initially, was a trend of observed temperature across the ensemble. Some versions of similar graphs have observed temperatures departing from projections entirely.[7] Figures 3 and 4 show Figure 2 rotated into skewed coordinates to look more like a control chart monitoring a process. Schmidt states that Earth’s temperatures are well contained within the ensemble – especially so after accounting for some extraneous factors (Figure 4). Yet, this misses an important point. The measurements in Figure 3 trend in an unlikely way across the ensemble, and have gone to running along the lower limit. After eliminating the trend in Figure 4 the comparison still shows observed temperatures hugging the lower end of the projections. Despite being told often that the departure of observations from the center of the ensemble is a non-issue, with each new comparison some unlikely feature remains to fuel doubt. It is difficult to avoid concluding that what is wrong is one of the following.
(1) The models do run too hot. They overestimate warming from increasing CO2, possibly because of a flawed parameterization of clouds or some other factor.
(2) The observations are running too cool. What I mean is there are factors external to the models which are suppressing temperature in the real world. The models are not complete. Figure 2 from Realclimate.org takes exogenous factors into account. Yet, note that while the inclusion of these factors reduces the improbable trend across the diagram, it leaves the improbable tendency to cling to the lower half of the diagram, which suggests item 1 in this list again.
(3) The models and observations are of slightly different things. The observations mix unrelated things together, or contain corrections and processing not duplicated in the models.
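How unlikely is it for observations to hug the lower half of the chart? A rough feel comes from the binomial distribution. The sketch below assumes each plotted point falls above or below the center line independently with probability one half, which is the in-control null hypothesis of a control chart. Annual temperatures are serially correlated, so the true probabilities would be larger than this gives; the function name and the counts in the example are hypothetical and are not read from the figures.

```python
from math import comb

def prob_at_least_k_below(n: int, k: int, p: float = 0.5) -> float:
    """P(at least k of n independent points fall below the center line)."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k, n + 1))

# Hypothetical example: 9 or more of 10 points below center.
# Exactly 11/1024, or about 1 percent, if the points were independent.
print(prob_at_least_k_below(10, 9))
```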
Figure 4. The dashed (forced) 95% bounds of Figure 2 redrawn in skewed coordinates (blue/orange) to look more like a control chart. The grey lines indicate the envelope of observations. Black line is target.
These charts present data only through 2014, but while observed temperatures rose into the target region of the chart with the recent El Nino, they have more lately settled back to the lower part of the chart. It takes an extraordinary event to push observations toward the target region. One more observation about these graphs seems pertinent. If, as Lenssen et al. claim, the 95% uncertainty bounds of the global mean temperatures are truly as small as 0.05 °C, then the spread in the various observations is, at times, unlikely itself.
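The last point can be made concrete with a little arithmetic. If two temperature estimates each carry a 95% bound of 0.05 °C, and if their errors are independent and Gaussian, then the 95% bound on their difference is the root sum of squares of the two bounds. Independence is a crude assumption here, since the datasets share much of their source data, and applying the same 0.05 °C to every dataset is also an assumption. The function name is hypothetical.

```python
import math

def difference_bound_95(u1: float, u2: float) -> float:
    """95% half-width of the difference of two independent Gaussian estimates,
    given the 95% half-width of each estimate."""
    return math.hypot(u1, u2)  # sqrt(u1**2 + u2**2)

# Two estimates, each with a 95% bound of 0.05 deg C: about 0.07 deg C.
print(difference_bound_95(0.05, 0.05))
```

Under these assumptions, two estimates of the same year that differ by much more than about 0.07 °C would be an unlikely event in its own right.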
Our little experiment here cannot settle the question of whether models run too hot, but two of our three possibilities suggest they do. It ought to be important to figure out whether this possibility is the truth.
5. Conclusion
The first draft of this essay concluded with the previous section. However, in the past few weeks there has been a lengthy discussion at WUWT about propagation of error, or what one could call propagation of uncertainty. The ensemble of model results is, in one point of view, an important monitor (like SPC) of health of our planet. If we believe that the assumptions going into production of the models are a true representation of how the Earth works, and if we are certain that our measurements represent the same thing the ensemble represents, then we arrive at the following: A trend across our control chart toward higher values suggests a worrying problem; a trend across the chart toward lower values suggests otherwise. But without some credible measure of bounds and resolution, no such use is reasonable.
One response of the climate science community to the apparent divergence of observations from models is to argue that there really is no divergence because the ensemble bounds could be widened to show true variability of the climate, and once this is done the ensemble limits will happily enclose observations. Or, they argue, there is no divergence if one takes into account exogenous factors ex post. But in my view arguing this way makes modeling pointless because it removes one’s ability to test anything. There is certainly a conflict between the desire to make uncertainties small, thus making a definitive scientific statement, and a desire to make the bounds larger to include the correct answer. The same point Vasquez and Whiting make here [8] …
“…Usually, it is assumed that the scientist has reduced the systematic error to a minimum, but there are always irreducible residual systematic errors. On the other hand, there is a psychological perception that reporting estimates of systematic errors decreases the quality and credibility of the experimental measurements, which explains why bias error estimates are hardly ever found in literature data sources….”
is what Henrion and Fischoff [9] found to be so in the measurement of physical constants over 30 years ago. Propagation of error plays an important role in the interpretation of the bounds and resolution of models and data. It is more than just initiation errors being damped out in a GCM. But to discuss its pertinence would make this post too long. Perhaps we’ll return in a week or two when that topic cools off.
6. Notes:
(1) Nathan J. L. Lenssen, et al., (2019) Improvements in the GISTEMP Uncertainty Model. JGR Atmospheres, 124, 6307-6326.
(2) Pat Frank https://wattsupwiththat.com/2019/09/19/emulation4-w-m-long-wave-cloud-forcing-error-and-meaning/
(3) R.C. Spencer, https://wattsupwiththat.com/2019/09/13/a-stovetop-analogy-to-climate-models/
(4) Nick Stokes, https://wattsupwiththat.com/2019/09/16/how-errorpropagation-works-with-differential-equations-and-gcms/
(5) AT&T Statistical Quality Control Handbook, Western Electric Co. Inc., 1985 Ed.
(6) RealClimate, NOAA temperature record updates and the ‘hiatus’ 4 June 2015, Accessed September 18, 2019.
(7) Ken Gregory, Epic Failure of the Canadian Climate Model, https://wattsupwiththat.com/2013/10/24/epic-failure-of-the-canadianclimate-model/
(8) Victor R. Vasquez and Wallace B. Whiting, 2005, Accounting for Both Random Errors and Systematic Errors in Uncertainty Propagation Analysis of Computer Models Involving Experimental Measurements with Monte Carlo Methods, Risk Analysis, Volume 25, Issue 6, Pages 1669-1681.
(9) Henrion, M., & Fischoff, B. (1986). Assessing uncertainty in physical constants. American Journal of Physics, 54(9), 791–798.
28 thoughts on “Do Models Run Hot Or Not? A Process Control View”
News Flash!!! The models will always run hot because they model a direct and near-linear relationship with CO2. CO2 demonstrates a near-linear uptrend established 12,000 years ago with the ending of the ice age. Additionally, the ground measurements are impacted by the Urban Heat Island Effect and “adjusted” to show a near-linear warming trend. The data is being adjusted to keep the models from running hot. Y=mX+b guarantees that they need to make Y (Temp) more linear to adjust for the X (CO2). They are even adjusting the RSS data to make it more linear.
To prove this is all pure nonsense, simply go to NASA: https://data.giss.nasa.gov/gistemp/station_data_v3/
Identify the ground stations with 0 to 10 BI that existed prior to 1902 (when the Hockeystick dog-legs) and download the “UNADJUSTED” data. You will see that there has been Zero, Nada, Zip warming over the past 116 years. None of the stations that I’ve examined show anything close to a linear increase over that time. Many if not most show temperatures recently below those of the 1902 level.
Excellent graph, especially the coverage graph showing zilch for Southern Hemisphere pre 1900. And yet we are supposed to believe we know the temperatures for 1/2 the globe before 1900. Only in lala land.
???
Why the bias?
The graphs, used as described by CO2isLife, do not pretend or claim to know the temperature for 1/2 the globe before 1900.
In fact, CO2isLife explains;
That temperature measurements taken from long term temperature stations do not demonstrate warming.
Making your whine about 1/2 of the globe a red herring logical fallacy.
So with 10% coverage of half the globe in 1880, we are supposed to have confidence in any numbers?
I don’t whine. Just don’t understand the logic of imputing any numbers with such a gap in our knowledge.
well of course…..when you adjust past temps to show a faster rate of warming….to fit an agenda
and then tune the models to that….they reproduce that same fake rate of warming
when you put this in….. https://i0.wp.com/cdiac.ornl.gov/epubs/ndp/ushcn/ts.ushcn_anom25_diffs_urb-raw_pg.gif
you get this out…. http://wattsupwiththat.files.wordpress.com/2013/06/cmip5-73-models-vs-obs-20n-20s-mt-5-yr-means11.png
CO2 says:
News Flash!!! The models will always run hot because they model a direct and near-linear relationship with CO2. CO2 demonstrates a near-linear uptrend established 12,000 years ago with the ending of the ice age. Additionally, the ground measurements are impacted by the Urban Heat Island Effect and “adjusted” to show a near-linear warming trend.
You & I know the billion-dollar models can be ridiculously simplified to a single linear equation of beefed-up (supposed H2O feedback) CO2 warming. That simplified linear equation has been shown here & other websites (climateauditdotcom for ex) numerous times in the past. Think of how many needy people could’ve been helped w/that money — instead laundered to the liberal elite apparatchiks (so-called climate-scientists)
The “ice age recovery” and “urban heat island” arguments are starting to wear a bit thin. UHI might be a factor in comparison with early 20th century temperatures but has had less influence on trend since the 1970s. Remember UAH, which is unaffected by UHI, also shows a 0.5 deg increase since 1979. Similarly, while it’s reasonable to assume that some of the warming since 1850 is due to a “recovery”, the warming has continued into the 21st century even when natural factors such as ocean oscillations & solar activity have been operating in their ‘cool’ cycles.
Also your point about the number of stations is irrelevant unless you are trying to argue that the climate was warmer in earlier centuries. Station coverage has been more than adequate since the 1940s.
As CO2 accumulates in the upper atmosphere it is likely that the earth will warm. Whether that warming will be excessive or even harmful in any way is another matter. Most responsible sceptics (Lindzen, Curry, Spencer, Jack Barrett etc) know that this, i.e climate sensitivity, is the only area of uncertainty.
The earth is warming. The earth will continue to warm. Get used to it. Continuing to deny it simply destroys the credibility of the sceptic cause.
I’m one of those that agree (a lot) that models do produce more warming than the temperature that is observed in the real world. However, it is not a good idea to show graphics with data that ends in 2013 and 2014. If you show the latest data, it will not detract from this idea because most of the recent warming is a remnant of the latest El Nino event and I’ll bet that things will go back to a flat trend. Showing data only until 2013 is exposing this article to critics.
Neither.
Models run on empty.
(There’s nothing there to run.)
What “models”?
Is the author referring to the computer games, used by government bureaucrats with science degrees, to express their personal consensus opinions on what causes climate change, in a way that is very complex and appears to be “real science”, but the climate predictions are always wrong ?
Or is he referring to these models:
https://en.wikipedia.org/wiki/File:Claudia_Bertolero_Miami_Fashion_Week.jpg
A gentle off topic rant on the state of reference materials and today’s culture:
I was reminded by that ATT handbook of my copy given me as a new Western Electric employee in 1963.
I did a Google search to refresh my recollections and was dismayed to find nowhere was there mention of the author, a true pioneer of industrial Quality Control, Walter Shewhart.
Dr. Shewhart wrote the manual for the Hawthorne (Chicago) plant in 1924, and the manual I got in the 60’s was virtually unchanged. However the best any reference in Google would report was that the manual was assembled “by a committee” in 1956.
The “history of Now” generation no longer feel it necessary to honor giants of the past.
The worst example was the remake of the 1950s best selling book and movie “Cheaper by the Dozen”. Dr Frank Gilbreth was a pioneer in the technology of Industrial Engineering who with his wife Lillian had 13 children. The entertaining part of the story was a very large family being raised by Industrial Engineering rules. But the real heart of the book was that Frank died early with much of his work still incomplete, and Lillian while mother of 13, picked up the mantle and established herself as an important figure in the technology.
So what does Hollywood do? They hire Steve Martin for the remake lead, but decide Industrial Engineering is too boring so they change the historical characters name and change his job to football coach. Toss Frank and Lillian into the dustbin.
Rant over; please resume.
I thought about listing Shewhart’s book, now published by Dover, as a potential reference to learn about process control, but didn’t do so in the final draft. I work in the same engineering department that Edwards Deming graduated from in 1921. Everyone around here knows of him. No one, for some reason has heard of Shewhart.
Bell Labs and Western Electric employed an unbelievable number of talented scientists and engineers, Nyquist, Shannon, Shewhart, Pierce, Brattain, Holden, Bardeen, Shockley, Penzias, Wilson….quite a few Nobel Prize winners. What places they must have been to work at.
Is the title still in question? Really?
I don’t think so.
Purely speculation but I think heat tends to escape earth in ways we don’t fully understand and are difficult to model. That and possible negative cloud feedback can make models run hot. Time will tell. Another 10 years and we will start to have good data. It is very important to keep model source code around to test after time has passed. Is the source code available so modelers cannot change it? We need to save all these graphs.
The above topic is relevant and in that vein I would suggest that climate science does not adequately address the uncertainty of measurements themselves, such as would be done by Gage R&R investigations.
There is no climate crisis as far as its behavior is concerned. Whether short term trends are being influenced by human CO2 emissions or not, nothing outside of normal variability is being observed.
They say they “found” the missing heat going into the oceans. (Covered under exogenous factors I assume)
If the ocean heat is a measurable factor it can be included in the analysis. But I don’t know if it can be relied on. Are sea temperatures reliably recorded for a long enough period of time to be able to dismiss the apparent bias we should have that the models are hot?
I think it was Trenberth who said “I’ve found the missing heat”, and before that that it was embarrassing that they couldn’t account for the “missing heat”.
It’s obvious from this that the mainstream climate modelers have been biased to the accuracy of their models over the observed temperatures, which is why I have a feeling the ocean heat is a too good to be true (for their models) factor that isn’t well verified.
My assumptions are that the ocean heat could explain the models but it’s not something that could be called consensus science given how recently it was ‘found’ and the obvious bias to accepting it without a second thought. Secondly, if the ocean is absorbing the proposed extra heat this is evidently a new change in the climate system that we don’t know the mechanics of and could go on for centuries. In geological timescales it would be completely unsurprising if this effect goes on long after effectively carbon free energy sources are developed. So there should still be a reasonable presumption given the data that the models are hot, and that the climate system may have also changed to damp surface temperatures down from the not so alarming projections we currently have.
The UN projects warming costs of 2 to 4% of GDP in 2100. It’s obvious that the cost analysis has been very biased towards finding potential costs rather than benefits. It’s also likely that the projected temperatures are high either because the models are hot or because the ocean is now absorbing more heat than before 2000.
In short, the least scientific analysis is to be alarmist given current data, and that’s even if you were to go with the UN’s own projection of costs which assume the models are correct and which have an obvious bias to assuming higher costs than benefits.
Yes I agree that if this heat is going into the deep oceans then the ocean provides a buffer of perhaps hundreds of years and warming of atmosphere then will not be a problem at all. Oceans have a massive capacity to store heat.
“(2) The observations are running too cool. ”
If there is something in the real world that is cooling temperatures, but this something is not included in the models, then this is a variation of (1), the models are running hot. (If there is something missing from the models, this a flaw in the models.)
The only time observations could be running cool is if there is some error in how observations are being taken that make the recorded temperatures lower than the actual temperatures.
At first blush it does seem that 2) is just the contrapositive of 1), but upon further reflection you might see that 2) refers to what we might call “wrong model” bias. 1) represents too much CO2 gain in models, while 2) leaves out real negative feedbacks and other influences. In any event some people in the climate science field make a distinction between 1) and 2), so I do as well.
I know Joe Bastardi has mentioned constantly that at least the American weather model forecast runs notoriously hot and cannot “see” cold air a week or more in the future. Telling that it can’t see cold but sees plenty of heat. Just a coincidence huh…..
There is the chance that the satellite temperature measurements accurately reflect physical reality. All the statistics in the world can’t confirm that or repudiate it.
“A trend across our control chart toward higher values suggests a worrying problem; a trend across the chart toward lower values suggests otherwise.”
But not in the case of Climate, down is disaster whereas up is getting back to normal.
I have come to the conclusion that few folks involved in climate science are indoctrinated in physical experimental science. They have become mathematicians, statisticians, and programmers that are removed from the real world and consequently their bias lies toward getting the (unreal) answer they want from the models rather than getting a real and repeatable projection of the world as it truly is.
I have mentioned on a few earlier threads that my university began teaching a probability and statistics course for new hires in engineering and the sciences because we found that many people were coming out of graduate school poorly grounded in experimental methods–ironically, those people better prepared turned out to be from the college of education because they had taken a research methods class or two. I suppose the thinking was that engineers and scientists should be able to teach research methods to themselves, but what actually happened is that research methods became an “unknown unknown”. They had become focused on teaching the theoretical aspects of their discipline. It is difficult to decide to teach yourself about something you are utterly unaware of.
The starting point in SPC is that you know what the dimension you are measuring is supposed to be. We do not know what the correct temperature is supposed to be.
True. In real process control we have at our disposal a real process to test and build a chart upon; and then we test exactly the same process sample by sample. In climate science we have only models to build the chart and compare them to observations which are different, of course–yet, the two seem similar in important ways.
The irony is that the warmists think they are forecasting temperature.
But all they are doing is forecasting the rise in CO2 concentration.
I want to open a can of worms, here.
But is it real?
The unspoken assumption here is that the data is Gaussian. What this means is the data set has a central tendency, or a mean value, or a tendency more or less steadily increasing or decreasing. Very important is the fact that deviations from said core values are “random” or “randomly distributed” or “Gaussian”. Then we base all our calculations on Gaussian statistics, including mean, standard deviation, 95% confidence intervals, and the revered “wee p values”. And so on.
But what happens when the data distribution is not “random” or “Gaussian”? What if there are well known influences on the measured parameter that are absolutely *not* random? Like perhaps a train of El Nino – La Nina cycles. A huge Super El Nino right in the middle of the record. A weird aborted El Nino – Pacific Hot Pool – real El Nino at the end of the record. Not predicted, not necessarily well understood, but *not* Gaussian distribution.
Now what can we say about the chances of this, or of that, or in control, or out of control, or the chance of 0.5 degree difference?
Statisticians: Have at it!