Guest post by Kevin Kilty
This short essay was prompted by a recent article regarding improvements to the uncertainty of a global mean temperature estimate. However, much bandwidth has been consumed lately on the related topic of error propagation [2, 3, 4], and so a small portion of this essay's concluding remarks is devoted to it as well.
Manufacturing engineers work to improve product design, make products easier to manufacture, lower costs, and maintain or improve product quality. Among the tools they use to accomplish this, many are statistical in nature, and these have pertinence to the topic of the surface temperature record and its interpretation in the light of climate model projections. One tool I plan to present here is statistical process control (SPC).
1. Ever Present Variation
Manufactured items cannot be made identically. Even in mass production under the control of machines, there are influences such as wear of the machine, variations in settings, skill of operators, incoming material property variations and so forth, which lead to variation in a final product. All precision manufacturing begins with an examination of two things. First, there is the customer specification. This includes all the important product parameters and the limits that these parameters must stay within. Functionality of a product suffers if these quality measures do not stay within limits. Second is the process capability. Any manufacturer worth the title will know how the process used to make products for a customer varies when it is in control. This leads the manufacturer to an estimate of how many products in a run will be outside tolerance, how many might be reworked and so forth. It is not possible to estimate costs and profits without knowing capability.
2. Process Capability and Control
If a manufacturer’s process routinely produces within the specifications, with perhaps only one item in a hundred, one in a thousand, or three in a million (six sigma) falling outside them (whatever is cost effective and achievable), then the process is capable. If it proves not capable, one might ask what investment in new machinery would make it capable; if the answer is not cost effective, one might pass on the manufacturing opportunity or have someone more capable handle it. When a process is in control, it is operating as well as is humanly possible given one’s capability. A process in control is an important concept for our discussion.
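The capability idea sketched above is often summarized with the indices Cp and Cpk. As a minimal illustration (the numeric values here are invented for the example, not taken from the essay):

```python
# Process capability indices, a minimal sketch with illustrative values.
# Cp compares the width of the customer specification to the natural
# process spread (6 sigma); Cpk also penalizes a mean that is off-center.

def capability(mean, sigma, lsl, usl):
    """Return (Cp, Cpk) for a process with the given mean and standard
    deviation, against lower/upper specification limits lsl/usl."""
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mean, mean - lsl) / (3 * sigma)
    return cp, cpk

# A process centered at 10.0 with sigma 0.1, specs 9.5 to 10.5:
cp, cpk = capability(10.0, 0.1, 9.5, 10.5)
print(cp, cpk)  # both ~1.67: comfortably capable

# The same process drifted off-center to 10.2: Cp is unchanged, Cpk drops.
print(capability(10.2, 0.1, 9.5, 10.5))
```

A Cpk of about 1.33 or better is a common rule of thumb for calling a process capable; roughly 2.0 corresponds to the "three in a million" six-sigma regime mentioned above.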
3. Statistical Process Control
Statistical process control (SPC) is mainly a process of charting and interpreting measurements in real time. Various SPC charts become a tool through which an operator, potentially someone of modest training, can monitor a process and adjust it or stop it if indications are that it is drifting out of control. There are many different possible control charts, but a common one is the X-bar chart, so named because the parameter being monitored and recorded on the chart is the mean attribute of a sample of manufactured items. It is often paired with an R chart, which shows the range within the same samples. R is favored in manufacturing because it conveys much the same information about variation as, say, the standard deviation, but with far less calculation. Let’s discuss the X-bar chart. Figure 1 shows an example of a paired set of charts.
Figure 1. A pair of control charts for X-bar and range. The X-bar chart shows measurements exceeding the control limits above and below, while the range chart shows no increase in variability. We conclude an operator is unnecessarily changing machine settings. Source.
The chart begins with its construction. First, there is a specified target value for the process, and a process is designed to achieve this target. Then some number of measurements are taken from this process while it is known to be operating as well as is humanly possible, i.e. in control. Measurements are gathered into consecutive groups of a fixed size, N (five and seven are common), and the mean and range of each group are calculated, followed by the grand mean (the mean of the group means) and the mean range. Dead center horizontally across the chart is the target value; horizontal lines are then placed above and below at some multiple of the process standard deviation, estimated from the range or from the standard deviation itself. These are known as the process control limits (the upper and lower control limits, UCL and LCL respectively).
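The construction just described can be sketched in a few lines. This assumes subgroups of size N = 5 and uses the standard control-chart constants A2, D3, D4 for that subgroup size, as tabulated in SPC references such as the AT&T handbook [5]:

```python
# Constructing X-bar and R chart limits from in-control data: a sketch.
# A2, D3, D4 are the standard control-chart constants for subgroups of 5.

A2, D3, D4 = 0.577, 0.0, 2.114

def chart_limits(subgroups):
    """subgroups: list of in-control samples, each of 5 measurements.
    Returns ((LCL, center, UCL) for X-bar, (LCL, center, UCL) for R)."""
    xbars = [sum(g) / len(g) for g in subgroups]       # mean of each group
    ranges = [max(g) - min(g) for g in subgroups]      # range of each group
    grand_mean = sum(xbars) / len(xbars)               # X-bar center line
    rbar = sum(ranges) / len(ranges)                   # R center line
    xbar_limits = (grand_mean - A2 * rbar, grand_mean, grand_mean + A2 * rbar)
    r_limits = (D3 * rbar, rbar, D4 * rbar)
    return xbar_limits, r_limits

# Illustrative in-control data, not from the essay:
xl, rl = chart_limits([[9, 10, 11, 10, 10], [10, 11, 9, 10, 10]])
print(xl)  # (8.846, 10.0, 11.154)
print(rl)  # (0.0, 2.0, 4.228)
```

The A2·R-bar term is simply a convenient way of placing the limits at roughly three standard errors of the subgroup mean without ever computing a standard deviation.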
At this point one uses the chart to monitor an ongoing process. Think of charting as recording a continuing sequence of experiments. On a schedule, our fixed number (N) of manufactured items is removed from production. The mean and range of some important attribute are calculated for this sample and the results plotted on their respective charts. The null hypothesis in each experiment is that the process continues to run just as it did during the chart-creation period. As work proceeds, the sequence of measured and plotted samples shows either the pattern expected of a process in control, or a pattern of unexpected variations suggesting a process with problems. Observation by an operator of an unlikely pattern, such as cycles, drift across the chart, too many points plotting outside the control limits, or points hugging one side of the chart, is evidence of a process out of control. An out-of-control process can be stopped temporarily while the process engineer or maintenance staff find and rectify the problem. One thing worth emphasizing is that SPC is a highly successful tool for handling variation in processes and identifying problems.
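Two of the unlikely patterns named above can be stated as mechanical rules, in the spirit of the classic Western Electric run rules: a single point beyond a control limit, and a long run of points on one side of the center line (the "hugging" pattern). A sketch, with illustrative numbers:

```python
# Two simple out-of-control tests, in the spirit of the Western Electric
# run rules: (a) a point beyond the control limits, and (b) a run of
# `run_length` consecutive points on one side of the center line.

def out_of_control(points, center, ucl, lcl, run_length=8):
    """Return a list of (index, reason) alarms for the charted points."""
    alarms = []
    for i, x in enumerate(points):
        if x > ucl or x < lcl:
            alarms.append((i, "beyond control limit"))
        if i >= run_length - 1:
            window = points[i - run_length + 1 : i + 1]
            if all(p < center for p in window) or all(p > center for p in window):
                alarms.append((i, f"{run_length} in a row on one side"))
    return alarms

# Eight points hugging the low side of a chart centered at 10:
print(out_of_control([9.5] * 8, center=10.0, ucl=11.0, lcl=9.0))
```

Neither rule requires judgment from the operator, which is what makes SPC usable by someone of modest training.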
Figure 2. “…Comparison of a large set of climate model runs (CMIP5) with several observational temperature estimates. The thick black line is the mean of all model runs. The grey region is its model spread. The dotted lines show the model mean and spread with new estimates of the climate forcings. The coloured lines are 5 different estimates of the global mean annual temperature from weather stations and sea surface temperature observations….” Figures and description: Gavin Schmidt.
4. Ensemble of Models
Let’s turn attention to the subject of climate. The oft-cited ensemble of model projections is something like a control chart. It represents a spread of model projections, carefully initialized to represent what we believe is the future path of mean Earth temperature given credible additions of CO2. It is not a plot of the full variation that climate models might conceivably produce, but rather the more controlled variation of our expectations, given what we know of climate and the differential equations representing it, when it is in control. It is this in-control concept that makes the process control chart and the projection ensemble similar to one another. The resemblance is even more complete with an overlay of observed temperature.
Figure 3. The grey 95% bounds of Figure 2 redrawn in skewed coordinates (blue/orange) to look more like a control chart. The grey lines indicate the envelope of observations. Black line is target.
This ensemble became controversial once people began placing observed temperatures on it. Schmidt produced such a comparison in a blog post in 2015 [6]; Figure 2 shows it. What the comparison between observed and projected temperature showed, initially, was a trend of observed temperature across the ensemble. Some versions of similar graphs have observed temperatures departing from the projections entirely. Figures 3 and 4 show Figure 2 rotated into skewed coordinates to look more like a control chart monitoring a process. Schmidt states that Earth temperatures are well contained within the ensemble, especially so after accounting for some extraneous factors (Figure 4). Yet this misses an important point. The measurements in Figure 3 trend in an unlikely way across the ensemble, and have gone to running along the lower limit. After eliminating the trend, Figure 4 still shows observed temperatures hugging the lower end of the projections. Despite being told often that the departure of observations from the center of the ensemble is a non-issue, with each new comparison some unlikely feature remains to fuel doubt. It is difficult to avoid concluding that what is wrong is one of the following.
(1) The models do run too hot. They overestimate warming from increasing CO2, possibly because of a flawed parameterization of clouds or some other factor.
(2) The observations are running too cool. That is, factors external to the models are suppressing temperature in the real world; the models are not complete. Figure 2, from Realclimate.org, takes exogenous factors into account. Yet note that while including these factors reduces the improbable trend across the diagram, it leaves the improbable tendency to cling to the lower half of the diagram, which points back to item (1) in this list.
(3) The models and observations are of slightly different things. The observations mix unrelated things together, or contain corrections and processing not duplicated in the models.
Figure 4. The dashed (forced) 95% bounds of Figure 2 redrawn in skewed coordinates (blue/orange) to look more like a control chart. The grey lines indicate the envelope of observations. Black line is target.
These charts present data only through 2014, but while observed temperatures rose into the target region of the chart during the recent El Niño, they have more lately settled back to the lower part of the chart. It takes an extraordinary event to push observations toward the target region. One more observation about these graphs seems pertinent. If, as Lenssen et al. claim, the 95% uncertainty bounds of the global mean temperatures are truly as small as 0.05 °C, then the spread among the various observational estimates is, at times, itself unlikely.
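The "hugging" argument can be made quantitative with a rough back-of-envelope calculation. Assuming, for illustration only, that an in-control process would place each point above or below the center line independently with probability 1/2, the chance of a run of k consecutive points on one particular side is (1/2)^k:

```python
# Rough illustration of why points hugging one side of a chart are unlikely
# for an in-control process. Assumption (stated, not from the essay): points
# fall above or below the center line independently with probability 1/2.

for k in (5, 8, 10):
    print(k, 0.5 ** k)

# A run of 8 on a pre-specified side has probability 1/256, about 0.4%,
# which is why 8-in-a-row is a traditional out-of-control signal.
```

Observed temperatures running along the lower half of the ensemble for years at a stretch is, under this simple independence assumption, exactly the kind of low-probability pattern a chart operator is trained to flag; autocorrelation in climate data would soften the number but not the qualitative point.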
Our little experiment here cannot settle the question of whether models run too hot, but two of our three possibilities suggest they do. It ought to be important to find out whether this possibility is the truth.
The first draft of this essay concluded with the previous section. However, in the past few weeks there has been a lengthy discussion at WUWT about propagation of error, or what one could call propagation of uncertainty. The ensemble of model results is, in one point of view, an important monitor (like SPC) of the health of our planet. If we believe that the assumptions going into production of the models are a true representation of how the Earth works, and if we are certain that our measurements represent the same thing the ensemble represents, then we arrive at the following: a trend across our control chart toward higher values suggests a worrying problem; a trend across the chart toward lower values suggests otherwise. But without some credible measure of bounds and resolution, no such use is reasonable.
One response of the climate science community to the apparent divergence of observations from models is to argue that there really is no divergence, because the ensemble bounds could be widened to show the true variability of the climate, and once this is done the ensemble limits will happily enclose the observations. Or, they argue, there is no divergence if one takes exogenous factors into account ex post. But in my view arguing this way makes modeling pointless, because it removes one’s ability to test anything. There is certainly a conflict between the desire to make uncertainties small, thus making a definitive scientific statement, and the desire to make the bounds larger to include the correct answer. The same point Vasquez and Whiting [8] make here…
“…Usually, it is assumed that the scientist has reduced the systematic error to a minimum, but there are always irreducible residual systematic errors. On the other hand, there is a psychological perception that reporting estimates of systematic errors decreases the quality and credibility of the experimental measurements, which explains why bias error estimates are hardly ever found in literature data sources….”
is what Henrion and Fischoff [9] found to be so in the measurement of physical constants over 30 years ago. Propagation of error plays an important role in the interpretation of the bounds and resolution of models and data. It is more than just initialization errors being damped out in a GCM. But to discuss its pertinence would make this post too long. Perhaps we’ll return in a week or two when that topic cools off.
(1) Nathan J. L. Lenssen, et al., (2019) Improvements in the GISTEMP Uncertainty Model. JGR Atmospheres, 124, 6307-6326.
(2) Pat Frank https://wattsupwiththat.com/2019/09/19/emulation4-w-m-long-wave-cloud-forcing-error-and-meaning/
(3) R.C. Spencer, https://wattsupwiththat.com/2019/09/13/a-stovetop-analogy-to-climate-models/
(4) Nick Stokes, https://wattsupwiththat.com/2019/09/16/how-errorpropagation-works-with-differential-equations-and-gcms/
(5) AT&T Statistical Quality Control Handbook, Western Electric Co. Inc., 1985 Ed.
(6) RealClimate, NOAA temperature record updates and the ‘hiatus’ 4 June 2015, Accessed September 18, 2019.
(7) Ken Gregory, Epic Failure of the Canadian Climate Model.
(8) Victor R. Vasquez and Wallace B. Whiting, 2005, Accounting for Both Random Errors and Systematic Errors in Uncertainty Propagation Analysis of Computer Models Involving Experimental Measurements with Monte Carlo Methods, Risk Analysis, Volume 25, Issue 6, Pages 1669-1681.
(9) Henrion, M., & Fischoff, B. (1986). Assessing uncertainty in physical constants. American Journal of Physics, 54( 9), 791– 798.