The Laws of Averages: Part 2, A Beam of Darkness

Guest Essay by Kip Hansen

 

This essay is the second in a series of essays about Averages — their use and misuse.  My interest is in the logical and scientific errors, the informational errors, that can result from what I have playfully coined “The Laws of Averages”.

Averages

As both the word and the concept “average” are subject to a great deal of confusion and misunderstanding in the general public and both word and concept have seen an overwhelming amount of “loose usage” even in scientific circles, not excluding peer-reviewed journal articles and scientific press releases,  I gave a refresher on Averages in Part 1 of this series.  If your maths or science background is near the great American average, I suggest you take a quick look at the primer in Part 1 before reading here.

A Beam of Darkness Into the Light

The purpose of presenting different views of any data set — any collection of information or measurements about a thing, a class of things, or a physical phenomenon — is to allow us to see that information from different intellectual and scientific angles — to give us better insight into the subject of our studies, hopefully leading to a better understanding.

Modern statistical [software] packages allow even high school students to perform sophisticated statistical tests of data sets and to manipulate and view the data in myriad ways.  In a broad general sense, the availability of these software packages now allows students and researchers to make [often unfounded] claims for their data by using statistical methods to arrive at numerical results — all without understanding either the methods or the true significance or meaning of the results.  I learned this by judging High School Science Fairs and later reading the claims made in many peer-reviewed journals.  One of the currently hotly discussed controversies is the prevalence of using “P-values” to prove that trivial results are somehow significant because “that’s what P-values less than 0.05 do”.  At the High School Science Fair, students were including ANOVA test results about their data, yet none of them could explain what ANOVA was or how it applied to their experiments.

Modern graphics tools allow all sorts of graphical methods of displaying numbers and their relationships.   The US Census Bureau has a whole section of visualizations and visualization tools. An online commercial service, Plotly,  can create a very impressive array of visualizations of your data in seconds.  They have a level of free service that has been more than adequate for almost all of my uses [and a truly incredible collection of possibilities for businesses and professionals at a rate of about a dollar a day].  RAWGraphs has a similar free service.

The complex computer programs used to create metrics like Global Average Land and Sea Temperature or Global Average Sea Level are believed by their creators and promoters to actually produce a single-number answer, an average, accurate to hundredths or thousandths of a degree or fractional millimeters.  Or, if not actual quantitatively accurate values,  at least accurate anomalies or valid trends are claimed.  Opinions vary wildly on the value, validity, accuracy and precision of these global averages.

Averages are just one of a vast array of different ways to look at the values in a data set.  As I have shown in the primer on averages, there are three primary types of averages  — Mean, Median, and Mode — as well as a number of more exotic types.

In Part 1 of this series, I explained the pitfalls of averages of heterogeneous, incommensurable objects or data about objects.  Such attempts end up with Fruit Salad, an average of Apples-and-Oranges:  illogical or unscientific results, with meanings that are illusive, imaginary, or so narrow as not to be very useful.  Such averages are often imbued by their creators with significance — meaning — that they do not have.

As the purpose of looking at data in different ways — such as looking at a Mean, a Median, or a Mode of the numerical data set — is to lead to a better understanding, it is important to understand what actually happens when numerical results are averaged and in what ways they lead to improved understanding and in what ways they lead to reduced understanding.

A Simple Example:

Let’s consider the height of the boys in Mrs. Larsen’s hypothetical 6th Grade class at an all boys school.  We want to know their heights in order to place a horizontal chin-up bar between two strong upright beams for them to exercise on (or as mild constructive punishment — “Jonny — ten chin-ups, if you please!”).  The boys should be able to reach it easily by jumping up a bit so that when hanging by their hands their feet don’t touch the floor.

The Nurse’s Office supplies the heights of the boys, which are averaged to get the arithmetical mean of 65 inches.

Using the generally accepted body part ratios we do quick math to approximate the needed bar height in inches:

Height/2.3 = Arm length (shoulder to fingertips)

65/2.3 = 28 (approximate arm length)

65 + 28 = 93 inches = 7.75 feet or 236 cm
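
For readers who like to check the arithmetic, here is a minimal sketch in Python of the quick calculation above.  The function name and the rounding of arm length to the nearest inch are assumptions made for illustration; the 2.3 ratio and the 65-inch mean are the figures used in the text.

```python
ARM_RATIO = 2.3      # height / arm length, the body-part ratio quoted above
INCHES_TO_CM = 2.54  # exact conversion factor

def bar_height_inches(height_in, arm_ratio=ARM_RATIO):
    """Standing height plus approximate arm length (shoulder to fingertips)."""
    # Arm length is rounded to the nearest inch, as in the hand calculation.
    arm_length = round(height_in / arm_ratio)
    return height_in + arm_length

bar = bar_height_inches(65)
print(bar, "inches")
print(bar / 12, "feet")
print(round(bar * INCHES_TO_CM), "cm")
```

Running it prints 93 inches, 7.75 feet and 236 cm, matching the hand calculation.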

Our calculated bar height fits nicely in a classroom with 8.5-foot ceilings, so we are good.  Or are we?  Do we have enough information from our calculation of the Mean Height?

Let’s check by looking at a bar graph of all the heights of all the boys:

[Figure: bar chart of the heights of the boys in Mrs. Larsen’s class]

This visualization, like our calculated average, gives us another way to look at the information, the data on the heights of boys in the class.  Because the boys range from just five feet tall (60 inches) all the way to almost six feet (71 inches), we realize that we will not be able to set one bar height that is ideal for all.  However, we see now that 82% of the boys are within 3 inches either way of the Mean Height, and our calculated bar height will do fine for them.  The 3 shortest boys may need a little step to stand on to reach the bar, and the 5 tallest boys may have to bend their legs a bit to do chin-ups.  So we are good to go.

But when they tried the same approach in Mr. Jones’ class, they had a problem.

There are 66 boys in this class and their Average Height (mean) is also 65 inches, but the heights had a different distribution:

[Figure: bar chart of the heights of the boys in Mr. Jones’ class]

Mr. Jones’ class has a different ethnic mix, which results in an uneven distribution, much less centered around the mean.  Using the same Mean +/- 3 inches (light blue) used in our previous example, we capture only 60% of the boys instead of 82%.  In Mr. Jones’ class, 26 of the 66 boys would not find the horizontal bar set at 93 inches convenient.  For this class, the solution was a variable height bar with two settings:  one for the boys 60-65 inches tall (32 boys), one for the boys 66-72 inches tall (34 boys).

For Mr. Jones’ class, the average height, the Mean Height, did not serve to illuminate the information about boys’ height to allow us to have a better understanding.   We needed a closer look at the information to see our way through to the better solution.  The variable height bar works well for Mrs. Larsen’s class as well, with the lower setting good for 25 boys and the higher setting good for 21 boys.

Combining the data from both classes gives us this chart:
[Figure: bar chart of the combined heights of the boys in both classes]

This little example is meant to illustrate that while averages, like our Mean Height, serve well in some circumstances, they do not do so in others.

In Mr. Jones’ class, the larger number of shorter boys was obscured, hidden, covered-up, averaged-out by relying on the Mean Height to inform us of the best solutions for the horizontal chin-up bar.

It is worth noting that Mrs. Larsen’s class, shown in the first bar chart above, has a distribution of heights that more closely mirrors what is called a Normal Distribution, a graph of which looks like this:

[Figure: graph of a Normal Distribution]

Most of the values cluster in a hump in the middle and fall off evenly, more or less, in both directions.  Averages are good estimations of data sets that look like this if one is careful to use a range on either side of the Mean.  Means are not so good for data sets like Mr. Jones’ class, or for the combination of the two classes.  Note that the Arithmetical Mean is exactly the same for all three data sets of height of boys — the two classes and the combined — but the distributions are quite different and lead to different conclusions.
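
The point can be illustrated with a small sketch.  The two lists of heights below are invented for illustration and are not the data behind the charts above; they are simply built so that both have a mean of 65 inches.  The helper name `share_within` is likewise hypothetical.

```python
from statistics import mean

# Hypothetical heights in inches, invented for illustration only.
hump_class = [61, 63, 64, 64, 65, 65, 65, 65, 66, 66, 67, 69]   # one central hump
split_class = [60, 61, 61, 62, 62, 63, 67, 68, 68, 69, 69, 70]  # two separate clusters

def share_within(heights, half_width=3):
    """Fraction of values lying within half_width of the arithmetic mean."""
    centre = mean(heights)
    inside = [h for h in heights if abs(h - centre) <= half_width]
    return len(inside) / len(heights)

data_sets = [
    ("hump", hump_class),
    ("split", split_class),
    ("combined", hump_class + split_class),
]
for name, heights in data_sets:
    print(f"{name}: mean = {mean(heights)}, share within 3 inches = {share_within(heights):.0%}")
```

All three invented sets report the same mean, yet the share of boys within 3 inches of it differs from set to set, which is the information the single number leaves out.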

US Median Household Income

A very common measure of economic well-being in the United States is the US Census Bureau’s annual US Median Household Income.

First note that it is given as a MEDIAN — which means that there should be an equal number of families above this income level as below it.  Here is the chart that the political party currently in power — regardless of whether it is the Democrats or the Republicans — with the Oval Office (US President) and both houses of Congress in its pocket, will trot out:

[Figure: the Good News! graph of US Median Household Income over the years]

That’s the Good News! graph.  Median Family Income is on a nice steady rise through the years, and we’re all singing along with the Fab Four: “I’ve got to admit it’s getting better, a little better all the time…”

This next graph is the Not So Good News graph:

[Figure: the Not So Good News graph of US Median Household Income, 1985 to 2015, in inflation-adjusted dollars]

The time axis is shortened to 1985 to 2015, but we see that families have not been gaining much, if at all, in Real Dollars, adjusted for inflation, since about 1998.
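
As a rough sketch of what “adjusted for inflation” involves: a nominal amount is rescaled by the ratio of a price index in the base year to the index in the year the amount was earned.  The incomes and index values below are invented for illustration (they are not Census Bureau or CPI figures), and the function name is hypothetical.

```python
# Hypothetical nominal median incomes (thousands of dollars) and price index values.
nominal_income = {1985: 24.0, 2000: 42.0, 2015: 56.0}
price_index = {1985: 50.0, 2000: 80.0, 2015: 110.0}

def to_real_dollars(amount, year, base_year, index):
    """Express an amount earned in `year` in the dollars of `base_year`."""
    return amount * index[base_year] / index[year]

real_income = {
    year: to_real_dollars(amount, year, 2015, price_index)
    for year, amount in nominal_income.items()
}

for year in sorted(nominal_income):
    print(year, "nominal:", nominal_income[year], "real (2015 dollars):", round(real_income[year], 2))
```

With these made-up numbers the nominal series more than doubles while the real series barely moves, which is the kind of difference that separates the two graphs.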

And then there is the Reality graph:

[Figure: the Reality graph of Mean Household Income by quintile, plus the Top 5%]

Despite the Good News! appeal of the first graph, and the so-so news of the second, if we dig below the surface, looking at more than just the single-number Median Household Income by year, we see a different story — a story obscured by both the Good News and the Not So Good News.  This graph is MEAN Household Income of the five quintiles of income, plus the Top 5%, so the numbers are a bit different and it tells a different story.

The population is broken into five parts (quintiles), shown as the five brightly colored lines.  The bottom-earning 60% of families, the green, brown and red lines, have made virtually no real improvement in real dollars since 1967.  The second-highest quintile, the middle/upper middle classes in purple, has seen a moderate increase.  Only the top 20% of families (blue line) have made solid steady improvement — and when we break out the Top 5%, the dashed black line, we see that not only do they earn the lion’s share of the dollars, but they have benefited from the lion’s share of the percentage gains.
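
To see how a single median can sit on top of very different group averages, consider the following sketch.  The ten incomes are invented for illustration (they are not Census Bureau data), and `quintile_means` is a hypothetical helper that assumes the number of households divides evenly by five.

```python
from statistics import mean, median

# Hypothetical household incomes in thousands of dollars, invented for illustration.
incomes = [12, 18, 25, 31, 38, 45, 54, 66, 95, 240]

def quintile_means(values):
    """Sort the values, split them into five equal groups, return each group's mean."""
    ordered = sorted(values)
    size = len(ordered) // 5  # assumes the count is a multiple of five
    return [mean(ordered[i * size:(i + 1) * size]) for i in range(5)]

print("median:", median(incomes))
print("quintile means, lowest to highest:", quintile_means(incomes))
```

The median says nothing about how far the top group sits from the bottom one; the per-quintile means make that spread visible.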

Where are the benefits felt?

[Figure: US Median Household Income, national level]

Above is what the national average, the US Median Household Income metric, tells us.  Looking a bit closer we see:

[Figure: US Median Household Income by state]

Besides some surprises, like Minnesota and North Dakota, it is what we might expect.  The NE US: NY, Massachusetts, Connecticut, NJ, Maryland, Virginia, Delaware — all coming in at the highest levels, along with California and Washington.  Utah has always had the more affluent Latter-Day Saints and along with Wyoming and Colorado has become a retirement destination for the wealthy.  The states whose abbreviations are circled have state averages very near the national median.

Let’s zoom in:

[Figure: map of US Median Household Income by county]

The darker green counties have the highest Median Household Incomes.  It is easy to see San Francisco/Silicon Valley in the west and the Washington DC-to-NYC-to-Boston megalopolis in the east.

This map answered my big question:  How does North Dakota have such a high Median Income?  Answer:  It is one area, circled and marked “?”, centered on Williams County, with Williston as the main city.  The area has fewer than 10,000 families.  And “Williston sits atop the Bakken formation, which by the end of 2012 was predicted to be producing more oil than any other site in the United States”; it is the site of America’s latest oil boom.

Where is the big money?  Mostly in the big cities:

MHI_by_County_Cities

And where is it not?  All those light yellow counties  are areas in which many to most of the families live at or below the federal Poverty Line for families of four.

[Figure: county income map with an overlay of US Indian reservations]

An overlay of US Indian reservations reveals that they are, in the west particularly, in the lowest and second lowest income brackets. (An interest of mine, as my father and his 10 brothers and sisters were born on the Pine Ridge in southwestern South Dakota, the red oval.)   One finds much of the old south in the lowest bracket (light yellow), and the deserts of New Mexico and West Texas and the hills of West Virginia and Kentucky.

One more graphic:

Household_Income_distrib_pe

What does this tell us?

It tells us that looking at the National Median Household Income as a single number, especially in dollars unadjusted for inflation, presents a picture that obscures, hides, whitewashes over the inequalities and disparities that are the important facts of this metric.  The single National Average (Median) Household Income number tells us only one very narrow bit of information — it does not tell us how American families are doing income-wise.  It does not inform us of the economic well-being of American families — rather it hides the true state of affairs.

Thus, I say that the publicly offered Average Household Income, rather than shedding light on the economic well-being of American families, literally shines a Beam of Darkness that hides the real significant data about the income of America’s households.  If we allow ourselves to be blinded by the Beam of Darkness that this sort of truth-hiding average represents, then we are failing in our duty as critical thinkers.

Does this all mean that averages are bad?

No, of course not.  They are just one way of looking at a batch of numerical data.  They are not, however, always the best way.  In fact, unless the data one is considering is very nearly normally distributed and changes are caused by known and understood mechanisms, averages of all kinds more often lead us astray and obscure the data we should really be looking at.  Averages are a lazy man’s shortcut and seldom lead to a better understanding.

The major logical and cognitive fault is allowing one’s understanding to be swayed, one’s mind to be made up, by looking at just this one very narrow view of the data — one absolutely must recognize that the view offered by any type of average is hiding and obscuring all the other information available, and may not be truly representative of the overall, big picture.

Many better methods of data analysis exist, like the simplistic bar chart used in the school boys’ example above.  For simple numerical data sets, charts and graphs, if used to reveal (instead of hide) information are often appropriate.

Like averages, visualizations of data sets can be used for good or ill  — the propaganda uses of data visualizations, which now include PowerPoints and videos, are legion.

Beware of those wielding averages like clubs or truncheons to form public opinion.

And climate?

The very definition of climate is that it is an average — “the weather conditions prevailing in an area in general or over a long period.”  There is no single “climate metric” — no single metric that tells us what “climate” is doing.

By this definition above, pulled at random from the internet via Google, there is no Earth Climate — climate is always “the weather conditions prevailing in an area in general or over a long period of time”.   The Earth is not a climatic area or climate region, the Earth has climate regions but is not one itself.

As discussed in Part 1 — the objects in sets to be averaged must be homogeneous and not so heterogeneous as to be incommensurable.  Thus, when discussing the climate of a four-season region, generalities are made about the seasons to represent the climatic conditions in that region during the summer, winter, spring and fall, separately.  A single average daytime temperature is not a useful piece of information to summertime tourists if the average is taken for the whole year including the winter days — such an average temperature is foolishness from a pragmatic point of view.
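
A small sketch makes the tourist’s problem concrete.  The monthly temperatures below are invented for a generic four-season region (they are not observations from any real station), and the grouping of months into seasons is an assumption made for illustration.

```python
from statistics import mean

# Hypothetical average daytime temperatures (degrees F) for a four-season region.
monthly_temp = {
    "Jan": 30, "Feb": 34, "Mar": 45, "Apr": 58, "May": 69, "Jun": 78,
    "Jul": 83, "Aug": 81, "Sep": 73, "Oct": 61, "Nov": 48, "Dec": 36,
}

# Assumed assignment of months to seasons.
seasons = {
    "winter": ["Dec", "Jan", "Feb"],
    "spring": ["Mar", "Apr", "May"],
    "summer": ["Jun", "Jul", "Aug"],
    "fall": ["Sep", "Oct", "Nov"],
}

annual_mean = mean(list(monthly_temp.values()))
seasonal_mean = {
    name: mean([monthly_temp[m] for m in months])
    for name, months in seasons.items()
}

print(f"whole-year average: {annual_mean:.1f}")
for name, value in seasonal_mean.items():
    print(f"{name} average: {value:.1f}")
```

With these invented figures the whole-year average lands in the high 50s, more than twenty degrees below the summer average, so it tells a summertime visitor very little about what to pack.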

Is it also foolishness from a Climate Science point of view?  This topic will be covered in Part 3 of this series.   I’ll read your comments below — let me know what you think.

Bottom Line:

It is not enough to correctly mathematically calculate the average of a data set.

It is not enough to be able to defend the methods your Team uses to calculate the [more-often-abused-than-not] Global Averages of data sets.

Even if these averages are of homogeneous data and objects, physically and logically correct, averages return a single number and can incorrectly be assumed to be a summary or fair representation of the whole set.

Averages, in any and all cases, by their very nature, give only a very narrow view of the information in a data set — and if accepted as representational of the whole, will act as a Beam of Darkness, hiding  and obscuring the bulk of the information;   thus,  instead of leading us to a better understanding,  they can act to reduce our understanding of the subject under study.

Averages are good tools but, like hammers or saws, must be used correctly to produce beneficial and useful results. The misuse of averages reduces rather than betters understanding.

# # # # #

Author’s Comment Policy:

I am always anxious to read your ideas and opinions, and to answer your questions about the subject of the essay, which in this case is Averages, their uses and misuses.

As regular visitors know, I do not respond to Climate Warrior comments from either side of the Great Climate Divide — feel free to leave your mandatory talking points but do not expect a response from me.

I am not an economist — nor a national policy buff – nor interested in US Two-Party-Politics squabbles.  Please keep your comments to me to the question of the uses of averages rather than the details of the topics used as examples.  I actually have had experience in building exercise equipment for a Youth Camp.

I am interested in examples of the misuse of averages, the proper use of averages, and I expect that many of you will have varying opinions regarding the use of averages in Climate Science.

# # # # #

245 thoughts on “The Laws of Averages: Part 2, A Beam of Darkness”

  1. Having two legs, I am slightly above the average number. No one has three, but many have one and some have none. Am I right?

  2. “The complex computer programs used to create metrics like Global Average Land and Sea Temperature or Global Average Sea Level are believed by their creators and promoters to actually produce a single-number answer, an average, accurate to hundredths or thousandths of a degree ”

    Wrong. The observational data is used to create a spatial prediction of the unmeasured locations. This prediction is referred to as an average. Really a misnomer. The precision of the prediction does not state our knowledge about the observations but rather minimizes the error of prediction.

    Operationally it means this.
    When we say the global average is 15.567 that means…
    If you select a random unmeasured location…and do this
    Say 1000 times that the prediction 15.567 will give you a minimum error..It will produce less error than say 15

    The observations are not averaged in any simple sense of the word. I used to think that. Then I read some code.
    Then I studied the theory of spatial prediction.

    You might try that.

    Finally, I have explained this many times, Kip.

    Ignoring those explanations means you cannot and do not engage in critical or skeptical thinking. You fool yourself.

    • Steven Mosher ==> Are you not the Steven Mosher who is a co-author of the paper titled:
      “Berkeley Earth Temperature Averaging Process” ?
      Is your paper mis-titled for some reason?
      Are you not a member of the Berkeley Earth team whose website publishes these graphs (among many other similarly labelled graphs):

      Are you telling me that you actually do something entirely different and simply falsely label the graphs as Average Temperatures?
      If your (BEST) results are not average temperatures, or temperature averages, why do you label them so and compare them to other temperature averages?
      Maybe you can explain why you compare BEST’s “spatial prediction[s] of [the temperatures at] unmeasured locations” to the Global Temperature averages from other groups.

      • Kip: If you read just the abstract of the BEST paper you linked, you would realize that BEST’s results are not simplistic averages. Thousands of critical words that are irrelevant to what climate scientists are actually doing don’t inform readers; they mislead them.

        A new mathematical framework is presented for producing maps
        and large-scale averages of temperature changes from weather
        station thermometer data for the purposes of climate analysis. The
        method allows inclusion of short and discontinuous temperature
        records, so nearly all digitally archived thermometer data can be
        used. The framework uses the statistical method known as Kriging
        to interpolate data from stations to arbitrary locations on the Earth.
        Iterative weighting is used to reduce the influence of statistical
        outliers. Statistical uncertainties are calculated by subdividing
        the data and comparing the results from statistically independent
        subsamples using the Jackknife method. Spatial uncertainties
        from periods with sparse geographical sampling are estimated by
        calculating the error made when we analyze post-1960 data using
        similarly sparse spatial sampling. Rather than “homogenize” the
        raw data, an automated procedure identifies discontinuities in the
        data; the data are then broken into two parts at those times, and the
        parts treated as separate records. We apply this new framework to
        the Global Historical Climatology Network (GHCN) monthly land
        temperature dataset, and obtain a new global land temperature
        reconstruction from 1800 to the present. In so doing, we find results
        in close agreement with prior estimates made by the groups at
        NOAA, NASA, and at the Hadley Center/Climate Research Unit in
        the UK. We find that the global land mean temperature increased
        by 0.89 ± 0.06°C in the difference of the Jan 2000-Dec 2009
        average from the Jan 1950-Dec 1959 average (95% confidence
        for statistical and spatial uncertainties).

        You could criticize BEST for splitting records at discontinuities. When discontinuities are caused by correction of a slowly accumulating bias, then splitting the record eliminates the needed correction and keeps the biased trend. When a station is moved from a gradually urbanizing site to a nearby park, homogenization and splitting introduce bias. If routine maintenance creates a breakpoint, bias is introduced.

      • Frank ==> I don’t need to read the Abstract, I’ve read the whole paper. Perhaps, then, you’ll do me the favor of reading my whole essay – both Part 1 and this one. I’ve wasted very few words on Climate Science, as it is truly a waste to do so, no one reads past the topic. I have certainly not accused them of making simplistic averages….
        There is only this bit “The complex computer programs used to create metrics like Global Average Land and Sea Temperature or Global Average Sea Level are believed by their creators and promoters to actually produce a single-number answer, an average, accurate to hundredths or thousandths of a degree or fractional millimeters. ” which applies to findings of Mean Sea Level as well. The truth of that simple statement is in the BEST graphs above — their products and their press releases.
        I’m sorry if you have had your feelings hurt by something I’ve said here — if you had used your own real name and affiliation when commenting, I would probably understand why you are attacking a simple elementary level discussion of what happens to our understanding of complex topics when we are presented with single-number Averages instead of more meaningful, explanatory views.

      • Kip: Thanks for the reply. No, my feelings weren’t hurt by anything you’ve said. If my words were intemperate or not carefully considered, I have no right to take out my frustrations on others.

        I’ve read extensively on both sides of the statistical analysis issue of the temperature record and can’t see much that is being done incorrectly. Yes, the public is being misinformed about “record-breaking temperatures” that are a few hundredths of a degree above the previous record. Most technical people understand the difference between mean, median, mode and the full spread of data, but some in the public don’t. (My son’s school wasted far too much time in math classes on this subject, so I don’t know how many people are really poorly informed – or just too lazy to think about what is hiding behind the word average or other statistical tricks.) I wish there were a way to produce an index that would weight the importance of temperature change: warming is often good in the winter and at night, warming in the northern polar region isn’t important to many humans – except to the extent it causes Greenland to melt (and Greenland survived several millennia of the Holocene Climate Optimum). I can’t come up with a defensible scheme. I explained in another comment why there is only a 1% difference between (average T)^4 and average (T^4). Surveying stations for bias (like UHI and poor siting) today can’t tell us if published trends are wrong. Only CHANGING bias can produce a faulty trend, and we have little ability to quantify changing bias. I honestly can’t find much that is seriously wrong with the temperature record except homogenization and BEST’s equivalent, splitting records. That is worth about 0.2 K of warming and some of those corrections are appropriate (TOB). I mentioned that above.

        I don’t comment using my full name because I don’t want to be mistaken for a den**r or associated with scientific fantasies. For example, I find ludicrous Lord Monckton and those who assert that lack of a statistically significant warming trend is proof of absence of warming. My journey through global warming blogs began with outrage at RC’s acceptance of errors in AIT and deletion of comments. They seemed to disdain someone named McIntyre, so (being somewhat contrarian in nature) I wanted to know what he had to say. When I carefully compared McIntyre to RC and other defenders of the consensus, I usually found McIntyre more credible (but WUWT often less credible). McIntyre recommended scienceofdoom.com, a wonderful resource for those who care about accurate physics without unnecessary forays into speculative science or policy. (After reading that blog from its beginning, I can’t tell what its host really believes about CAGW, but his very earliest posts give a hint. Many skeptics comment there, but den**l isn’t practical if you start with generally-accepted physics, a ground-rule for that blog). For the most part, I usually end up with a lukewarmer position. I’d love to read a scientific post demonstrating that ECS could be below 1 K/doubling, but none have survived scrutiny. Judy Curry posted an article of mine about the 97% consensus under my full name. If my anonymity under these circumstances offends you, it is certainly your right to not take me seriously.

        Steve Mosher’s journey from skeptic at ClimateAudit and author of a book on Climategate to critic of many skeptics is an interesting phenomenon. I respect his science, but not his apparent conclusion that the consensus must now be right because their temperature record appears to be right. There are plenty of other good reasons to doubt that CAGW is coming besides the temperature record. (See Nic Lewis.)

      • Frank ==> I will put you in my mental list of like-minded people. In the Climate Wars one has to ignore the screeching rantings and battle cries of the Climate Warriors — from both sides of the divide — and just try to filter out the idiocy (including that published in peer-reviewed journals), absorb that which is supported by real science (strained through very skilled critical thinking) and if so inclined, try to communicate the true bits to others in a way they can understand.
        Stokes complained here that I wrote such an elementary essay on so basic a point — but fails to see that this is the level that 90+% of the readers here need. At Judith’s, where some of my writing appears, the audience is different and I have a different approach.
        I will respect your need for privacy. Personally, I am past that — well retired and no one can threaten or molest. Besides, I have always been very willing to take the heat for my personal opinions — perhaps too much so according to some — but this is not everyone’s way.
        Good luck to you — Anthony is on vacation and WUWT could use some more good solid sensible writing.

    • Mosher [snip . . . pointless rants don’t add to what we know. Try harder to teach or learn . . . mod]

    • In shooting you take into account two factors: precision and accuracy. Precision indicates high consistency of aim. The shots cluster together on the target, even though they may fall well away from the intended point of aim. Precision reflects the shooter’s steadiness of grip and quality of eyesight.

      Accuracy is a different thing. It is a measure of how close to the actual point of aim a round falls. In the real world, accuracy is far more important than precision. That is because an accurate rifleman has a higher likelihood of making a “useful” shot than a precise shooter whose weapon is not properly sighted in, or who cannot “accurately” judge the range and effects of weather on shooting. Ten precise but inaccurately placed rounds are a waste.

      So respecting climate models the real question is do models accurately bracket real conditions, or do they cluster precisely with a bias in the aim. Every plot of climate models I have seen shows that they are precise but biased rather than accurate, if somewhat scattered. It isn’t the precision that is the problem but rather the lack of accuracy. The bias indicates poor “sight adjustment.”

      • You’re confusing precision with repeatability. Repeatability is the indication of consistency (how close the shots cluster). Precision is how thin are the circles on the target you are using to measure your accuracy and repeatability (or what is the subdivision on the tape measure you use).

        I can be very precisely very wrong: “I missed the target by 1.00003401 miles”
        I can be very imprecise in my claim of accuracy: “I was within 0 light years of the target”
        I can have a precise measurement with high inaccuracy: “3.312 +/- 2.500 mm”

        and significant figures are not the same thing as precision (even if they are sometimes used to approximate it): the decimal number 30 has 1 significant digit. the binary number 11110 has 4 significant digits. They both indicate the exact same quantity but if I were to use the significant figures as an indication of precision I get 10 (decimal) in one case and 2 (decimal) in the other. I just don’t know what the precision is unless I have many numbers from the same instrument to compare (or the specs of the instrument). For example if I see a measurement of 3.125″, is my precision 1/1000th of an inch or 1/8th or maybe even 1/16th or 1/32nd? I don’t know unless I look at the measuring device (tape measure or calipers?) or look at enough other measurements to infer it.

      • David S,
        You are obviously not familiar with the shooting sports. In addition though, your post highlights the need for a precise definition of what is being measured and how the measurements are used.

      • David, significant figures aren’t universal precision. However, they are used to show precision if there is no error given. The presumption of 5.2 means that it is between 5.15 and 5.25 if there is nothing else given.

        If you have a different estimate of error and do not give it, then you are being deceptive.

        Finally, repeatability and precision are normally used as synonyms, even in the law (40 CFR 60 Appendix B uses it this way). You are mentioning only precision of speech, not precision of measurement.

    • I have bought a lottery ticket with the same numbers for 2 years! Why you may ask? Ah, the law of averages!

    • If that is true, then your methodology is faulty. Averaging for precision only works if you are measuring the same thing many times, not different things many times. Temperature can only be measured once, because it only occurs once. You cannot measure the temperature of a given second ever again. Of course you can measure it at that instant many times, but that’s not what you are claiming you do.

      Temperature changes in the fourth dimension, unlike say a strip of metal.

      And of course you are not measuring but extrapolating, because it is, as you say “unmeasured”, then you are not measuring at all anyway.

      So how can you claim that something that is not measured is measured?

    • To Steven Mosher: In reference to “Operationally it means this.
      When we say the global average is 15.567 that mean…
      If you select a random unmeasured location…and do this
      Say 1000 times that the prediction 15.567 will give you a minimum error..It will produce less error than say 15”

      So help me understand the process here. The code is designed, through using the theory of spatial prediction, to minimize if not entirely eliminate the error associated with the prediction. BUT, the prediction, theoretically, could deviate substantially from reality, yes? Is there a link between the precision of the predicted value vs. the precision to predict the temperature in the unmeasured location? Have we taken the time (and, obviously, money) to test these predictions by measuring some of these locations?

      In other words …..

      15.567 +/- 0.002, (wow, really precise prediction, let’s go to Vegas)
      15.1 +/- 0.1 (damn thermometers, if only we could make them better)

      But the precision in the prediction is poor, yes? The Falcons still lost the Super Bowl …….

      Am I reading this correctly? If not, please clarify.

      • Jake ==> Mosher’s (BEST’s) method is exhaustively laid out in the 2015 paper titled “Berkeley Earth Temperature Averaging Process” (link is to the full.pdf). It makes quite a read. You will find quite a few surprises in there. See if you can find what the true error bar for all (every single one) of the kriged (their infill method) points is. (I found it, and it will amaze you.)

      • Kip: Thank you, link followed and paper printed. I’m looking forward to figuring this out.

      • Kip,
        The real question is: do we know what the real error bars are for their final product (global average temperature)? I don’t think with the information we have that it is truly calculable.

      • Paul ==> No, we do not….the reasons we don’t know are legion, but start with the fact that until the digital (electronic) automatic temperature age, all Daily Highs and Daily Lows were recorded in whole degrees, each actually indicating a range one degree wide. A record of 79 degrees (C or F) meant, in practice, “a reading from the thermometer anywhere between 78.5 and 79.5”. Averaging the Daily Hi and Daily Lo gives a number that is also a range, one degree wide. No amount of averaging gets rid of the fact that the original measurement result is a range, therefore any further processing also produces a range of exactly the same magnitude. One has to add to that all other sources of error and uncertainty: unvalidated thermometers, changes in equipment or their deployment, vegetative changes, the height of the weatherman….
        No temperature average before the digital age can have error bars less than +/- 0.5 degrees, and logically, they must be much larger to account for all other sources of error or inaccuracy.
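Read as a worst-case bound, the whole-degree argument in the comment above can be sketched in a few lines of Python. This is only an illustration of interval arithmetic, not anything taken from the BEST paper; the function names and the sample readings (79 and 61) are made up for the example.

```python
def reading_interval(recorded, half_width=0.5):
    """Range of true values consistent with a whole-degree record."""
    return (recorded - half_width, recorded + half_width)


def mean_interval(intervals):
    """Worst-case interval for the mean of several interval-valued readings."""
    n = len(intervals)
    low = sum(lo for lo, _ in intervals) / n
    high = sum(hi for _, hi in intervals) / n
    return (low, high)


# Hypothetical daily high and low, each recorded to the nearest whole degree.
daily_high = reading_interval(79)
daily_low = reading_interval(61)

daily_mean = mean_interval([daily_high, daily_low])
width = daily_mean[1] - daily_mean[0]
print(daily_mean, "width =", width)
```

Under this worst-case reading the width of the result stays at one degree however many readings go into the mean. Whether the worst case is the right model for rounding error is the point the commenters in this thread disagree on.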

    • “The observational data is used to create a spatial prediction of the unmeasured locations.”

      Anyone who thinks that you can make 2 observations 1000 miles apart, and from them predict what is happening at the mid point is no scientist.

    • [name-calling snipped by author – violates WUWT policy] Mosher,

      If you say that the average is 15.567, but the instruments are only accurate to +/- 0.5, all you did was lie. The correct figure would be 15.5 +/- 0.5 but somehow you never say that…

      • But 15.567 +/- 0.5 is more truthful. the temp could be between 15.067 and 16.067. if I say 15.5 +/- 0.5 then I am saying it is between 15 and 16 when it actually might be slightly over 16. It all boils down to if I think anyone cares about the .067 . It’s more illustrative for 15.75 +/- 0.5 though because it is more likely someone might care about the 0.25 even with the 0.5 accuracy.

      • David, If he said 15.567+/- 0.5, then I would agree with you, but he didn’t. People have been giving these in simple numbers, “15.567”. As we tell every child in freshman chemistry that means “15.567 +/- 0.0005” unless stated otherwise.

        In short. HE SHOULD HAVE STATED OTHERWISE.

      • David S. this is not correct. You just cannot add significant digits to a thermometer accurate only to +/- 0.5 degrees C. This is just inventing accuracy that the instrument does not possess, making up meaningless numbers. 15.5 +/- 0.5 C is the correct way to report this average. I am not sure that any “Climate Scientist” would ever admit this because they love to see those headlines about “Hottest Year EVAH” even though this is scientifically meaningless.

  3. The average of 49 and 51 is 50 and the average of 1 and 99 is also 50. I’ve posted that one a few times.

    The average temperature does not consider the Maximum and Minimum temperatures and as a result, much of the important information is lost by doing so. Here’s a U.S. map of color coded Maximum, summer temperature trends from NOAA’s Climate at a Glance which DOES give you the Max and Mins.

    You don’t get that result if you just look at averages.
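The 49/51 versus 1/99 point can be checked directly. The sketch below is a minimal illustration, and the `summarize` helper is a name invented here, not something from the thread.

```python
from statistics import mean


def summarize(values):
    """Return the mean together with the spread that the mean alone hides."""
    return {"mean": mean(values), "min": min(values),
            "max": max(values), "range": max(values) - min(values)}


narrow = summarize([49, 51])
wide = summarize([1, 99])
print(narrow)
print(wide)
```

Both summaries report the same mean, and only the min, max, and range fields tell the two pairs apart.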

    • Steve Case ==> Thank you for the input — I like the map and the data — totally obscured by the usually published 48 States Average Temperature graphs.

      • Thanks for the reply, I’m 73 this year and in the Milwaukee metro area . I can tell you that summers are definitely cooler and winters are definitely warmer than they were a few decades ago. The weather or climate if you want to go that far is definitely milder. In terms of weather extremes that we often hear about, they are going to have to start talking about extreme mildness (-:

      • “they are going to have to start talking about extreme mildness (-:”

        About five years ago there was a minor uproar here over GISS or NOAA (IIRC) doing just that in its press releases.

    • But don’t the warmists claim that the effect of CO2 forcing is to increase nighttime minimum temperature, not daytime maximum temperature? If Steve Case’s info is correct, and the warmists are correct, should we not be seeing a gradual decrease in the range of daily min-max temperatures? Do we?

      • DHR – maybe; but many OTHER things cause the same effect. Long term min max’s show the same thing for ‘some’ areas, particularly cities (UHI?) but clouds and a number of other things can do the same thing. Correlation is one thing but how do you show Causation when there are so many variables. One of Bob Tisdale’s books covers this issue in some detail with some good graphics of convergence/divergence by year and by latitude.

      • 120 F in Phoenix and 115 F in Tucson today. Not hotter than ever and not near the average.

      • I think you’re spot on. Global warming caused by more greenhouse gases should cause overall temperature increases with a decrease in day-night temperature differences.
        Warming by the sun, or other outside sources, should cause an overall increase in temperature with the day-night temperature difference ratio staying about the same.

      • Alan McIntire June 20, 2017 at 5:28 am:

        Global warming caused by more greenhouse gases should cause overall temperature increases with a decrease in day-night temperature differences.

        Shouldn’t it be the other way around?

        The greenhouse effect works by absorbing a more-or-less constant fraction of the outgoing surface radiance and recycling a more-or-less constant fraction of that back to the surface, thereby recycling a more-or-less constant fraction of the outgoing surface radiance back to the surface overall.

        But there is greater outgoing surface radiance on the warmer day-side of the planet than on the cooler night-side, so the recycled fraction must also be greater on the day-side, thereby causing a greater surface warming on the day-side than on the night-side. If in fact the night-side is warming faster than the day-side, then I think that could not be caused by an increase in the strength of the greenhouse effect.

        The same argument applies to all areas of the surface that are warmer or cooler than others, of course, and the greatest difference in warming-rates due to an increase in the greenhouse effect should arise between the tropics and the polar regions, with warming in polar regions being slightest and warming in the tropics being greatest.

        I have never understood the alarmists’ idea that the greenhouse theory somehow implies that warming should occur faster on the night-side and in the polar regions of the planet. It seems the direct opposite of what the theory actually predicts to me.

      • In reply to RP. I originally started to investigate this issue when I read some posting purporting to prove the Stefan-Boltzmann law “wrong” based on lunar temperatures. You might find this link, regarding Newton’s law of cooling, of interest.

        http://www.ugrad.math.ubc.ca/coursedoc/math100/notes/diffeqs/cool.html

        The law gives this equation:

        T(t) = Ta + (T0 -Ta)*1/(e^kt)

        Where T(t) gives Temperature, T, as a function of time, t,
        Ta is ambient background temperature, and T0 is the starting temperature of the body warming up or cooling off.

        mass atmosphere = 5* 10^18 kg=5*10^21gm
        temp atmosphere 255K (effective radiating temp to space- underestimates heat content of total atmosphere)
        specific heat 1.01 joules/gm C
        5*10^21 gm * 1.01 joules/gm C * 255 K = 1.288*10^24 joules

        radius earth = 6400km= 6.4*10^6 meters.
        area earth = 4 pi r^2 =514,718,540,364,021.76
        240 watts/sq meter = 240 joules/sec per square meter
        60 sec/min*60 min/hr*24hr/day=86,400 secs per day

        5.147* 10^14 sq meters*240 joules/sec/sq meter *8.64*10^4 secs/day= 1.067*10^22 joules per day radiated away
        1.067*10^22/1.288*10^24 = 0.83%

        So the daily loss of heat of the atmosphere is less than 1% per day. That makes sense when you realize that although surface
        temperatures may swing by 20 degrees K or more during the 24 hour day/night cycle thanks to direct radiation to space from earth’s surface, meteorologists are still able to make fairly accurate estimates of daily highs and lows for about a week- because of that temperature stability of the atmosphere. Since the temperature for most of the atmosphere remains about the same throughout a 24 hour day, we continue to get the same daytime radiation from the sun, but the radiation from the atmosphere increases by the same amount both day and night. Since temperature is proportional to the fourth ROOT of radiation, that implies more warming at night than during the day from additional greenhouse warming.
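The back-of-envelope arithmetic in the comment above can be reproduced as follows. All input figures are the commenter's own; only the variable names are introduced here.

```python
import math

# Heat content of the atmosphere, using the commenter's figures.
mass_atmosphere_g = 5e21        # 5*10^18 kg expressed in grams
specific_heat = 1.01            # joules per gram per degree
temp_atmosphere_k = 255.0       # effective radiating temperature
heat_content_j = mass_atmosphere_g * specific_heat * temp_atmosphere_k

# Energy radiated to space in one day.
radius_m = 6.4e6
area_m2 = 4 * math.pi * radius_m ** 2
outgoing_w_per_m2 = 240.0
seconds_per_day = 60 * 60 * 24
radiated_per_day_j = area_m2 * outgoing_w_per_m2 * seconds_per_day

daily_fraction = radiated_per_day_j / heat_content_j
print(f"heat content   : {heat_content_j:.3e} J")
print(f"radiated daily : {radiated_per_day_j:.3e} J")
print(f"daily fraction : {daily_fraction:.2%}")
```

With these inputs the daily fraction comes out just under 1%, in line with the figure quoted in the comment.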

      • Alan McIntire June 20, 2017 at 4:06 pm.

        Thanks for explaining your thinking behind your proposition that additional greenhouse warming will produce more warming at night than during the day.

        I must confess that I found your argument hard to follow. I cannot see the relevance of Newton’s law of cooling, nor your subsequent calculation of the percentage of atmospheric heat content that is radiated to space daily, interesting as these items of intellectual stimulation may be. In fact, your complete argument seems to be contained in your last two sentences, where you say:

        Since the temperature for most of the atmosphere remains about the same throughout a 24 hour day, we continue to get the same daytime radiation from the sun, but the radiation from the atmosphere increases by the same amount both day and night. Since temperature is proportional to the fourth ROOT of radiation, that implies more warming at night than during the day from additional greenhouse warming.

        I have highlighted what seems to me to be your key-assumption in the quotation above. Where have you got that from? How is it implied in the theory of the greenhouse effect? I can see no rational justification for it.

      • note that equation
        T(t) = Ta + (T0 -Ta)*1/(e^kt)

        During a 24 hour day, the earth receives about the same amount of radiation from the atmosphere, call it
        X, which will be some fraction of the average daily radiation we get from the sun.
        The warming we get from the sun changes constantly, but take an average for the 12 hour daytime
        period and call it 1.
        In this case, the ambient background temperature would be proportional to
        (1+X)^0.25 thanks to the Stefan-Boltzmann constant.

        Then during the daytime, the earths surface starts out with T0 less than Ta and starts
        warming up trying to reach Ta.

        At night, the ambient background temperature, Ta, becomes directly proportional to
        (X)^0.25 thanks to that same Stefan-Boltzmann constant. At night, T0 at earth’s surface is WARMER
        than the ambient background temperature, Ta,so the earth starts to cool off to a temperature
        proportional to the fourth root of X.

        Now throw in additional greenhouse gases so the radiation from the atmosphere to earth’s surface
        increases to X+Y, where X and Y are both positive numbers. If they’re not both positive numbers, that means the greenhouse effect can cause COOLING. As an aside, it DOES cause cooling in some frequencies on Saturn’s moon Titan, at Titan’s low temperature.

        Now compare (new daytime ambient temperature/old daytime ambient temperature) with
        (new nighttime ambient temperature/old nighttime ambient temperature).

        You’ll get
        ((1+X+Y)/(1+X))^0.25 for new daytime Ta/old daytime Ta, which will ALWAYS be smaller than
        ((X+Y)/X)^0.25 for new nighttime Ta/old nighttime Ta.

        Try it for any numbers.
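Taking up that invitation, here is a minimal numerical check. It reads the two ratios as ((1+X+Y)/(1+X))^0.25 for daytime and ((X+Y)/X)^0.25 for nighttime; the function names and the sample values of X and Y are arbitrary choices made for this sketch.

```python
def day_ratio(x, y):
    """New/old daytime ambient temperature ratio: ((1+X+Y)/(1+X))^0.25."""
    return ((1 + x + y) / (1 + x)) ** 0.25


def night_ratio(x, y):
    """New/old nighttime ambient temperature ratio: ((X+Y)/X)^0.25."""
    return ((x + y) / x) ** 0.25


# Arbitrary positive sample values for X (back radiation) and Y (the increase).
samples = [(0.5, 0.01), (1.0, 0.1), (2.0, 0.5), (0.1, 3.0)]
for x, y in samples:
    print(x, y, round(day_ratio(x, y), 4), round(night_ratio(x, y), 4))

always_smaller = all(day_ratio(x, y) < night_ratio(x, y) for x, y in samples)
print("daytime ratio always smaller:", always_smaller)
```

The inequality holds for any positive X and Y because (1+X+Y)/(1+X) = 1 + Y/(1+X), which is less than 1 + Y/X = (X+Y)/X.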
        The rest may be very boring and you can skip it

        Check the article on newton cooling and you’ll also see how
        to derive the k experimentally from 1 hour of cooling, assuming you have a local climate and not local weather. Although the air gains or loses heat on the order of less than 1% per day, clouds make up about 1/6 of the total greenhouse effect, and cloud weather can fluctuate by plus or minus 50 watts or so over short intervals, giving us weather rather than climate.

        I couldn’t solve T(t) = Ta + (T0 -Ta)*1/(e^kt) exactly, with Ta constantly changing, but I tried a numerical approximation.
        With the sun constantly changing elevation and radiation, I picked an average latitude, 30 degrees, with
        earth at the equinox. I looked up a table of sun angles at various times in a 12 hour day,
        used 12 hourly figures for various Ta amounts, and tried to get a reasonable figure for
        warming.

        Before I could solve the daytime figure, I needed a number for e^kt. I got that by picking an
        average sundown temperature, T0. For the ambient background temperature, Ta, I used Trenberth’s figure of
        333 watts.

        The Stefan-Boltzmann equation for a blackbody goes
        T(degrees Kelvin) = S(constant)*(watts/square meter)^0.25. Our first step is to find that S constant.
        Doing a google search, I find 1000 K implies a blackbody flux of 56790 watts/square meter.

        T = 64.77867 W^0.25. I plugged in 333 nighttime watts using Trenberth’s figure, and got
        Ta nighttime ambient temperature is 276.72 K.
        I plugged in T0, average sundown temperature of about 293K

        Giving

        T(t) = Ta + (T0 -Ta)*1/(e^kt)= 276.72 +(293 -276.72)/(e^kt)

        I needed an additional figure, T(t) for a period of hours, then I could solve for k.
        I got something like k=1.07053. I made the assumption that the same k works for daytime as for nighttime, otherwise the problem would be unsolvable.

        I played around with daytime warming, nighttime cooling, and finally got a balance.
        I plugged in additional greenhouse warming, and got a higher balance both daytime and nighttime, with about 1/3 of additional average warming happening during the day, and 2/3 of the additional warming happening at night. In doing so, I convinced myself that the “Stefan-Boltzmann” deniers didn’t know what they were talking about.
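The Stefan-Boltzmann step of that calculation, together with the Newton's-law-of-cooling formula it feeds, can be sketched as below. The reference pair (1000 K, 56790 watts/square meter) and the 333-watt and 293 K figures are the commenter's. The function names are invented here, and the value of k in the demonstration is a placeholder, since the comment does not give enough detail to reproduce its own k.

```python
import math

# Derive the constant S in T = S * W^0.25 from the commenter's reference pair.
reference_temp_k = 1000.0
reference_flux = 56790.0          # watts per square meter, as quoted
s_const = reference_temp_k / reference_flux ** 0.25


def blackbody_temp(flux_w_per_m2):
    """Blackbody temperature in kelvin for a given flux."""
    return s_const * flux_w_per_m2 ** 0.25


def newton_cooling(t, t0, ta, k):
    """T(t) = Ta + (T0 - Ta) / e^(k*t), the formula used in the comment."""
    return ta + (t0 - ta) * math.exp(-k * t)


ta_night = blackbody_temp(333.0)  # nighttime ambient temperature
t0_sundown = 293.0                # commenter's average sundown temperature

k_placeholder = 0.1               # hypothetical value, NOT the commenter's k
for hours in (0, 6, 12):
    print(hours, round(newton_cooling(hours, t0_sundown, ta_night, k_placeholder), 2))
```

Whatever k is used, the curve starts at T0 and decays toward Ta, which is the behavior the numerical day/night balance in the comment relies on.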

    • Steve ==> Can you send me a link that brings up that particular map? Or did you compile it from data at NOAA?

      • Kip
        Compiled from NOAA Climate at a Glance data. I saved a text file of all the states June through September Max temperatures Alabama to Wyoming:

        Alabama, Maximum Temperature, June-September
        Units: Degrees Fahrenheit
        Base Period: 1895-2000
        Date,Value,Anomaly
        189509,89.2,-0.2
        189609,90.7,1.3
        etc. [over 6000 lines]

        Contact me at: [ stacase at hotmail dot com ] if you would like that file

    • Let’s call this map a “Beam of Dusk.” The vast majority of States have so many climate regions that a “State average” is almost as meaningless as a “National average.”

      I’d like to see one that breaks it down to “temperature trend by county,” like the median income one – that would be far more meaningful. (I would make a small side bet, thanks to UHIs, that it would be almost identical to the income map, too, if you used the same coloring scheme.)

      • @Kip, seen many just like that one. Nice for knowing the general climate of an area – but useless considering the grid size (any specific place within the region may be very little like the “regional” climate).

        However, if overlaid on the (at least county scale) trend map that I really want – hmm. Some very interesting things might pop out.

      • One of the problems with gridded data is the loss of detailed geographical influence. If you look over at Climate Audit, Nic Lewis has an interesting discussion on the influence of grid scale on estimates.

      • @Duster – that’s one of the basic problems with “climate modeling.” The current smallest model grid sizes are (barely, and that is arguable) good enough for a “proof of concept” model. Nowhere near anything useful. At ten kilometer resolution, you might start to get some useful results. Somewhere between one and five kilometers, you’d have a pretty good model, assuming everything else was correct… (Slight addenda here – some regions require subgrids with much finer resolution than one kilometer. And some others could be as much as 100 kilometers without much loss of information. This ignores the vertical dimension, too, which I believe will need to be something like 100 meters resolution at the very least.)

        We don’t have that kind of resolution in measurements for model validation, by any means; nor do we have the processing systems to handle such a model. Which is why current “models” are completely useless.

      • Kip Hansen. (Writing to Observer)

        Observer ==> There is this, the Koppen Climate regions map of North America:

        A good display of regional (multi-state area) “climates” across the United States and Canada – using NASA/Hansen’s worthless and misleading Mercator Projection rectangles, but I digress.

        Now, show the “pre-CO2” current world Global Circulation Model RESULTS and the grid rectangles that define the results of the 24 models that duplicate that map – both land and oceans and mountains and plains – to 0.20 degree accuracy.

      • RACookPE1978 June 20, 2017 at 4:39 am
        Kip Hansen. (Writing to Observer)

        Observer ==> There is this, the Koppen Climate regions map of North America:

        A good display of regional (multi-state area) “climates” across the United States and Canada – using NASA/Hansen’s worthless and misleading Mercator Projection rectangles, but I digress.

        Did Hansen use Mercator projections? The copies of his papers that I’ve seen didn’t.

    • “….and the average of 1 and 99 is also 50.”

      I don’t think anyone at the Fed will ever admit this, but I believe one of the reasons that everybody and their brother missed the looming financial crisis was that economists and policy makers relied on National home price and foreclosure statistics rather than drilling down and looking at the discrete elements. We did not have a homogeneous foreclosure problem. There were housing markets that should have been telling those in Washington that a huge bubble was forming while in other markets only above average price increases were experienced. When the foreclosures started, they were not uniformly distributed across the country. California foreclosure rates increased by 2000% between 2005 and 2009. Florida and Arizona and a few other states had very large increases as well. At the same time other states had less than double the foreclosure rates. A few states had barely a blip of foreclosures and nothing out of the ordinary.

      Looking at the National averages of home price increases and foreclosure rate increases did not disclose what was actually happening throughout the country. Sometimes averages don’t tell us a thing.

    • Steve: What is a “maximum summertime temperature”? The average high temperature EVERY DAY for ALL of stations in a state for JJA? The single highest recording on ANY summer DAY at ANY STATION in a state? What is and isn’t a “significant” decline over what period in this map?

      I found my way to this website, but it still wasn’t clear how to reproduce your map
      https://www.ncdc.noaa.gov/cag/

      • Frank ==> If you are going to participate here, you really must read at least the associated comments before trying to kick the barn down.
        Steve gave the information you are asking for in a comment above with a kind offer to share the file he compiled of the data gathered at NOAA’s NCDC.

      • Kip: Thank you for pointing out the comment above, which I missed. Since the output is about 6000 lines, that would be one high record per state per season per each of 105 years. FWIW, I suspect that many record temperatures are due to artifacts. I also suspect that the number of artifacts is gradually decreasing with time. With a dozen or dozens of temperature readings being taken in the average state every day, I doubt there is anything useful to be learned from a relative handful of extremes. Perhaps a future article will prove my snap judgment incorrect.

        Respectfully, Frank

  4. Good commentary. I took statistics some 40 years ago, and this has been a rather better exposition on the faults of statistical measures than the class was.
    The great advantage of the internet is that it is cheap and easy to show the actual graphs of the data one is using the statistical measures to describe. A skewed or bimodal distribution can be shown directly, rather than just describing it.
    What seems to matter with global temperatures is the geographic distribution, not a single average. If, as appears to be the case, that “warming” is almost entirely in high latitudes, one can show that, rather than just use not entirely clear descriptions of a map.

    • The calculated global average temperature is useless because it is averaging source areas (apples) with sink areas (oranges). The mass balance used in the models is even worse because it assumes that natural source and sink rates balance out (a false assumption). The natural source areas for CO2 produce around 20 times anthropogenic emissions.

  5. I learned this by judging High School Science Fairs and later reading the claims made in many peer-reviewed journals.

    One of my buddies had a thesis adviser who grumped that students, given a formula, would attempt to use it no matter how inappropriate it was for the situation.

    Engineers usually get that beaten out of them before they get their licenses. I suspect that some scientists never do.

    As you observe, just because someone can use tools to generate a number, it doesn’t mean that they actually understand what they’re doing. On the other hand, they are likely to think they know what they’re doing … and that’s a very bad thing.

      • One of the things I used to do in a previous job or two was to make sure that the young engineers understood what they were doing. Just because they put correct inputs into a properly functioning computer simulation doesn’t mean that the answer the computer spits out is not stupidly wrong.
        There is a lot of difference between being a good engineer and a data-feeder to a simulation.

      • My father was a career civil engineer of some note, and worked well past normal retirement age because he enjoyed the work. When he finally left the field at the age of 70, I asked him “why now”? Being a “real” engineer, he always spoke carefully …. he paused and said “I’ve grown tired of teaching new employees how to do the job correctly.” I’ve never forgotten that conversation …….

    • I managed several hundred scientists whose single biggest problem was the misapplication of statistical models. Often they would collect data then apply some statistical model out of a textbook or paper to their data whether or not their data met any of the model’s assumptions. We brought in, first on a part time basis, and eventually hired an expert statistician to review all experiments and projects prior to the project being approved. We still had scientists try to misuse statistics.

      • Ed ==> Boy oh boy — that’s the ticket! The Stats guys really should be involved from the very first with study design so that the data gathered will be amenable to certain statistical tests and analysis. That and study design should be pre-registered — even peer-reviewed — before any work is started.

  6. “Averages are good tools but, like hammers or saws, must be used correctly to produce beneficial and useful results. The misuse of averages reduces rather than betters understanding.”

    This was a wonderful essay and I intend to steal great portions of it to use in my own classes to illustrate the issues. Thank you for doing these two essays and the one yet to come.

    I think the part above that I quoted is very important for everyone to understand. I read “How to Lie with Statistics” as a student decades ago as well as having taught the subject. I think that people should understand how one can be mathematically correct while still pulling the wool over your eyes and misleading you. Your essay makes that abundantly clear. Bravo!

      • I have taught in college, high school, and middle school. I am doing private middle school at present. (so much easier than the others now that I am so old)

      • Kip, I teach AP Chemistry at the high school level, with your permission I will, like Mark, include these essays in my teaching.

      • Jake ==> of course, anything you find useful. Helping kids to understand these issues at an early age prevents their being mis-educated later on….

    • I remember a friend making the point that after Three Mile Island, the claim was made that within a given radius of the reactor the average radiation exposure was a trivial amount above the background. What wasn’t acknowledged was that most of the radiation was in a downwind plume, and it was of consequence. Thus, the elevated exposure was hidden with an average.

      • Clyde s,
        That sounds sinister.
        There was no actual harm done from the plume & residue.
        Geoff

      • Geoff,
        Whether harm was done or not, there was an attempt to use averages to down play the seriousness of the release. That is the essence of Kip’s article.

  7. I had this average problem with my launch monitor. For instance, if I hit a six iron I might get a series like 210 213 215 208 143 200 197 204 214 135 yds. Its software calculates an average of 194 yds. But I would not play the club that distance. The median is a better number, 208 or 210 yds, to play the shot. Unless something is normally distributed the average doesn’t mean a whole lot. I told the company about this but they laughed at me.
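The launch-monitor numbers make a handy worked example of mean versus median. The sketch below uses the ten distances exactly as listed in the comment; the variable names are invented here.

```python
from statistics import mean, median

# Six-iron distances in yards, as listed in the comment.
carries = [210, 213, 215, 208, 143, 200, 197, 204, 214, 135]

average_carry = mean(carries)
median_carry = median(carries)

# Dropping the two mishits shows how strongly they pull the mean down.
solid_strikes = [d for d in carries if d > 150]
average_solid = mean(solid_strikes)

print("mean of all shots   :", average_carry)    # 193.9
print("median of all shots :", median_carry)     # 206.0
print("mean of solid shots :", average_solid)
```

Computed this way the median is 206 yds, a little under the 208 to 210 the commenter suggests, but the point stands: two mishits drag the mean about 12 yds below the distance a typical strike actually travels.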

    • Jamie ==> Thanks for weighing in — I’m afraid golf (it is golf, isn’t it?) analogies go over my head.

  8. When the global average temperature anomaly is an average of 8,000 estimated temperatures, and not a single measured temperature, the situation is even worse…

    • Michael S. Kelly ==> Yes, averaging averages of averages…..and yes, many of them estimated or in-filled at best.

      • Kip, I suggest that, in the future, you not refer to those sets of estimates (adjusted or infilled) as “data sets”, which suggests that they are something they clearly are not.

        The consensed climate science community appears to believe that it is the modern day successor to Rumpelstiltskin, able not only to spin straw (bad data) into gold (good data), but also to spin nothing (missing data) into gold as well.

  9. It seems to me that this ‘number crunching capacity’ without the traditionally expected understanding of ‘why and how’ is a manifestation of a common problem perhaps best exemplified by the ‘uncertainty principle’ in quantum physics. That is, just as speed and position are two parameters inversely related in terms of accuracy in quantum physics, number crunching capacity and deeper mechanistic understanding are also at risk of a similar dichotomy as the former is made available to the less highly technically educated portion of the populace. Project that onto the mainstream media and it is clear what happens. This issue has effectively been with us forever. Traditionally the ‘shamans’/’seers’/’witchdoctors’ etc. saw patterns in the environment/stars/entrails and developed an authoritative modus operandi to articulate their case… and enhance their power. Sound familiar now?

    • M eward ==> Yes, very familiar….turning only vaguely understood science, like physics, into New Age mumbo-jumbo. Very popular in Hollywood.

  10. Very interesting article. My only cavil so far is in the coverage of average income. The average income within a state or county is meaningless to the individual; I am not comforted in my relatively low position on that scale by knowing that my city or my state is well above average and, thus, far ahead of me. My comfort as an individual is that, if I am now in the lowest quintile of average income, I am by no means required to stay in that low position. In fact, if I have read correctly, the vast majority of individuals who are within the lowest quintile in a given year will be in a higher position within a few years–often, a much higher position. Thus, the use of averages says nothing of economic mobility, which is an essential part of the American dream. Yes, I say to a struggling young couple, you are in strained circumstances now; but keep working, and in a few years your situation will be far different.

    • John M. Ware ==> The beauty of the American system is that one is not born to remain in poverty simply because one’s parents were poor. Careful reading of my essay reveals a bit about my family background: neither of my sets of grandparents were anything but dirt poor, hard working people; my maternal grandparents didn’t have electricity in their home until well after WWII. My children grew up below the Federal Poverty Line, but now lead lives of semi-independence from daily labor. I have been retired and serving in humanitarian work, self-funded, for the last 13 years.
      Thank you for your optimism.

      • Brings up another point – even that map is seriously flawed – not your fault, sir. The “real dollars” in one place are not worth the same as the “real dollars” in another – but are calculated from a single national average of inflation. First failure. The second one is that it is “income” – not “disposable income.”

        I took over the payroll system from another developer in my company – and we both noted that his salary was just about 150% of mine, although we had the same position, title, and he only had about six months in seniority. BUT! He lives in Los Angeles, I live in Tucson. His “real dollars” were not worth nearly as much as mine when going shopping – and his disposable income after California’s punitive taxation was barely 10% higher than mine. Taking all factors into account, we figured that he was actually worse off than I was! (Yes, he did leave the company not long afterwards…)

        Both of us would have been below the real poverty line, of course, if we had been in NYC or San Francisco.

        Which also illustrates that even a county-grained temperature trend map would need to be treated very carefully – even for urban areas. Just taking population of the county would be incorrect – one would need to use the population density to get close to useful information – and even then, there would be those that are seriously inhomogeneous where density is not a good parameter, like mine, where there is a single heavily urbanized area (albeit growing), but the majority of the area is not urbanized, and almost certainly never will be, between the Reservations and the Air Force practice bombing range.

      • Another aspect to note about income disparities by quintile is that, over time, you would never expect much growth in real income in the bottom quintile. Why, for example, would the labor of a recent high school graduate or recent unskilled immigrant command more relative value twenty years from now than it does today? That’s only going to occur if demand for unskilled labor grows over time, and I’m not sure you would even want that in a developed economy. What you want to see is growth in the upper quintiles which, along with even a modest amount of generational and educational economic mobility, is a sign of significant growth in economic opportunity.

      • Kurt -interesting point. I understand you are saying that there will be low value, unskilled jobs and the value of these should not change. But why would we not want even the lowest value jobs to be adding more value as time goes on? A few generations ago a farm laborer could plough a small field in a morning, but now can do a huge field in the same time. The factory worker of a couple of generations ago could produce only a fraction that a factory worker today produces.

      • seaice1

        Kurt -interesting point. I understand you are saying that there will be low value, unskilled jobs and the value of these should not change. But why would we not want even the lowest value jobs to be adding more value as time goes on?

        You are forgetting that the federal and state governments began, under Democrat idols Kennedy and Johnson, to subsidize poverty and break up the family structure known since the Babylonian-Assyrian era through their perpetual welfare-rewarding state benefit structure.

        So, with some 20% of US households now on welfare (under its hundreds of guises and programs and agencies), and with welfare rewards ratcheted up every year by the identical cost-of-living raises that come from the near-equal yearly inflation metrics, is it any surprise that the same 20 percent of households are at poverty levels now as when the “Grate Society” began in 1965 to pay people in poverty not to work harder?

        A few generations ago a farm laborer could plough a small field in a morning, but now can do a huge field in the same time.

        Thanks to an intelligent use of readily available fossil fuels to fabricate the tractors, harvesters, trucks, processing plant, storage facility, and shipping centers. Oh wait! The CAGW crisis means we must restrict fossil fuel use… and starve millions to death.

      • seaice1,
        I think you missed the part where he said “relative” value. The relative value of the work done by people in the first quintile 50 years ago was about the same as today. And it says nothing about mobility – the ability for people to move up. In a free society, people will tend to rise up the rankings according to their abilities and ambition. And having a large absolute spread means there is more opportunity for people to move up; the sign of a robust economy.

      • Paul, yes he did mention relative value, but he also said “you would never expect much growth in real income in the bottom quintile”. We have often seen growth in real income in the bottom quintile over the last 100 years or so.

      • garymount ==> Very interesting, another commenter brought up the economic mobility issue. The study referred to in your comment is Measuring Income Mobility in Canada, 2016 (this appears to be an update of the 2012 report).
        If you are very interested, perhaps you could read it through and tell us why it is misleading… it took me only two minutes to see what I think is the fatal flaw.

      • Kip, I must confess to not yet having read the whole report, but the approach is interesting compared to the graphs you show for USA. This is due to following individuals as they move through the quintiles, rather than focusing on the individuals who happen to be in the quintiles at any particular moment.

        Thus the Canadian study revealed that absolute income for those they labelled “lower quintile” has risen by a staggering 781%. Your graphs show the bottom quintile stagnating, or actually showing a declining share of the income. That could be because the people in that bottom quintile are totally different individuals 10 years later.

        The two stories are not necessarily conflicting, as it would depend on the level of mobility. I would be very interested to see a similar study from the USA to see if there was a similar mobility. My suspicion is that mobility would be less, but I am prepared to say that is based on information about social mobility in the USA that is anecdotal rather than systematic.

        I certainly did not spot a fatal flaw, nor why it was misleading. Although I must confess that at first reading I thought it was telling a very different story to the one you tell through the graphs in the article.

      • seaice ==> The flaw I see, a first impression, is that many of the individuals they track are individuals who are new to the workforce: read “teenagers” or “college students working part-time” or even “single mothers first couple of years returning to the workforce”. Of course these people work their way up the economic ladder, especially the teenagers and the college kids, but there are those families who struggle along year after year: the real poor, not just those with entry level jobs, but fathers of families stuck at the bottom.

      • Kip. The studies are detecting different things. The Canada study describes the trajectory that people often have, that is, starting with a low paid job and then moving up, then eventually moving down again. The important factor is how many people fail to move up: families or households that seem stuck in low grade jobs. The problem of averages again, which can obscure important factors. It would be very interesting to see a similar study in USA. As I said, I suspect there would be less mobility than in Canada. It would also be interesting to see the quintile income in Canada: has there been an increase in the lower quintile that has not occurred in the USA?
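To make the distinction in this sub-thread concrete, here is a minimal sketch of how the two kinds of study can both be right. All names and incomes are invented for illustration (thousands of dollars) and are not taken from the Canadian report or the US graphs. A cross-sectional view asks who is in the bottom fifth in each year; a panel view follows the people who started in the bottom fifth.

```python
from statistics import mean

# Hypothetical incomes in thousands of dollars; NOT data from either study.
pop_year0 = {"teen": 4, "student": 6, "stuck": 12, "clerk": 25, "driver": 30,
             "nurse": 40, "teacher": 45, "engineer": 60, "manager": 80, "owner": 120}
# Ten years on: two people have retired and two new entrants have arrived.
pop_year10 = {"teen": 35, "student": 50, "stuck": 13, "clerk": 30, "driver": 34,
              "nurse": 48, "teacher": 52, "engineer": 75, "new_teen": 4, "new_student": 6}

def bottom_quintile(pop):
    """Names of the lowest-income fifth of a {name: income} population."""
    ranked = sorted(pop, key=pop.get)
    return ranked[: len(pop) // 5]

# Cross-sectional view: average income of whoever is in the bottom fifth each year.
cross_year0 = mean(pop_year0[n] for n in bottom_quintile(pop_year0))
cross_year10 = mean(pop_year10[n] for n in bottom_quintile(pop_year10))

# Panel view: follow the people who were in the bottom fifth in year 0.
tracked = bottom_quintile(pop_year0)
panel_year10 = mean(pop_year10[n] for n in tracked)

print(cross_year0, cross_year10)   # bottom-quintile average looks stagnant
print(tracked, panel_year10)       # the tracked individuals have moved far up
print(pop_year10["stuck"] - pop_year0["stuck"])  # the family that barely moved
```

In this toy population the bottom-quintile average is flat while the tracked individuals rise sharply, and neither summary shows the household that stayed near the bottom. That is the point both commenters make about what each average can hide.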

  11. A well written essay with many good points, but, unless you are arguing semantics, the sentence “The Earth is not a climatic area or climate region, the Earth has climate regions but is not one itself.” is false. All of the “climate regions” of the Earth are connected and continuously exchanging energy and are, on average, one climate system.

    • Slipstick ==> My point is that the whole Earth does not have a “climate” of its own, except in the very vague sense used by astronomers to classify exo-planets as having “Earth-like climates”. Climate Regions are parts of the whole, but the whole is not a Climate Region in and of itself. Some might have other opinions on this — but to try to define the “Earth’s climate” one has to average heterogeneous climate regions which are basically incommensurable.
      In the merchant marine, we used to train recent hires that “Ships have Boats, but the Ship is not a Boat, it is a ship.” Not sure if this is true in other languages… but you get the idea. The definition of a ship is “a vessel larger than a boat for transporting people or goods by sea.” Ships have life boats and other boats (launches, sea sleds, etc.) aboard them that are smaller.

      • Kip Hansen,
        Since all climate regions have the same constituents, which are free to, and do, move to other climate regions, and all share common sources of energy, I cannot identify a single property of any region which would make it incommensurable with any other region. Can you provide an example? As I mentioned before, I suspect your objection is one of semantics, as implied by your “ship vs. boat” analogy, and, in fact, has no physical basis.

      • Slipstick ==> Give me a description then, if you please, of Earth’s Climate. Compare to the methods used in the Koppen Climate Region classifications. What is the climate average of the Sahara Desert and the Appalachian Mountains? How would that average make sense?

      • I think there are two distinct issues that are being conflated. I’m not sure that anyone refers to an average “climate” of the Earth, though they do measure averages of state variables related to climate, e.g. temperature, total precipitation, etc.

        To the extent, for example, that “average” temperature increases are being used to describe the physical effect of CO2 on the Earth’s biosphere, I wholly agree that this is improper since there is no “average” climate for which an increase in average temperature would be representative. But the average global temperature might still be useful as an investigatory metric to try to relate CO2 emissions to the change in total energy stored by the climate over time, since the spatially averaged value of global temperature has to change over time as a function of increased CO2 concentration. I say “might” because the devil is always in the details, but I wouldn’t agree with a blanket assertion that globally averaged temperatures convey no useful information, or globally averaged precipitation conveys no information, etc.

      • Kip Hansen,
        By citing Koeppen-Geiger you’ve proved my point. The classifications are based on the ranges of temperature and precipitation in each region, properties that are certainly commensurate. Tell me, which of cloud cover, precipitation, and temperature, humidity, air pressure, atmospheric gas mix, particulates, wind speed, etc., at various altitudes, are incommensurate between the various climate regions of the Earth and why I can’t take an average of each of these values to produce a representation of the entire Earth’s climate.

      • Slipstick writes

        why I can’t take an average of each of these values to produce a representation of the entire Earth’s climate.

        Kip has stated that mathematically you can, but it’s meaningless. What meaning would you attach to the average of the rainfall over the Sahara Desert and the Appalachian Mountains?

        Perhaps a better example is what meaning would you attach to the average of the temperature of the Sahara desert and the bottom of the Mariana trench?

      • You mention rainfall. This is something we can in principle measure and total for the globe. The term “average rainfall” may not be useful for weather forecasting, but surely it is a parameter that does have some meaning?

        Say the total rainfall over the Earth changed significantly. This is surely a change in global climate. We would need to break it down to regional effect to assess the impact of this hypothetical change, but does everyone think that this is a meaningless measure?

      • “Earth does not have a “climate” of its own, except in the very vague sense used by astronomers to classify exo-planets as having “Earth-like climates””
        You have said this is vague, but is it really? It seems a very important factor. A non Earth-like climate means it is uninhabitable. That may be “broad brush” but surely indicates that global climate is a hugely important factor and discussion of global climates a very important thing.

        It seems a bit parochial to dismiss this concept because we are so used to thinking about very local weather.

      • seaice writes

        Say the total rainfall over the Earth changed significantly. This is surely a change in global climate.

        But since an increase in the precipitation cycle means an increase in energy transported to the upper atmosphere and a decrease in surface temperature, one might wonder whether it’s even possible in climatic time frames.

      • Tim,
        “But since an increase in the precipitation cycle means an increase in energy transported to the upper atmosphere and a decrease in surface temperature, one might wonder whether it’s even possible in climatic time frames”

        Whether you think this is possible or not, surely it would be a meaningful global climatic parameter. For example, if your theory says it couldn’t happen, but measurements showed it did happen, then that would require a modification of your theory. This would be a global climate parameter that would be very important to your model.

        My point is that global totals or “averages” are real and important parameters that are worth measuring.

      • seaice writes

        For example, if your theory says it couldn’t happen, but measurements showed it did happen, then that would require a modification of your theory.

        So for example “Detection of human influence on twentieth-century precipitation trends”

        http://www.nature.com/nature/journal/v448/n7152/abs/nature06025.html

        States in the abstract

        Human influence on climate has been detected in surface air temperature [1-5], sea level pressure [6], free atmospheric temperature [7], tropopause height [8] and ocean heat content [9]. Human-induced changes have not, however, previously been detected in precipitation at the global scale [10-12], partly because changes in precipitation in different regions cancel each other out and thereby reduce the strength of the global average signal [13-19].

        And go on to say

        Here we compare observed changes in land precipitation during the twentieth century averaged over latitudinal bands with changes simulated by fourteen climate models. We show that anthropogenic forcing has had a detectable influence on observed changes in average precipitation within latitudinal bands, and that these changes cannot be explained by internal climate variability or natural forcing.

        So in this case they’re saying that they’re expecting changes from the models (yeah, whatever) and that those changes can be detected as changes in location.

        So, no, there are no measured changes in amount, only in location, and averaging hides the detail anyway.

        I guess this example kills two birds with one stone in this thread.

      • “and that these changes cannot be explained by internal climate variability or natural forcing.”

        The authors cannot explain the changes by internal variability or natural forcing…

      • Tim, my point is that this aspect of global climate has real and measurable meaning. I am not discussing whether such a thing has happened, but making the point that global climate does have meaning.

      • I think the point is that a single number for “climate” is nonsense. If the tropics stay much the same temperature, and the high latitudes warm, that would be rather a benefit. Assuming equal warming everywhere is deceptive.

      • Seaice writes

        my point is that this aspect of global climate has real and measurable meaning.

        The question of global precipitation rate has meaning and is measured by adding the global precipitation. Whether you average it as a per surface area per time value or not makes no difference. But at least the number has meaning.

    • As you describe the situation Slipstick, you are describing weather, not climate.

      Supposedly, climate is weather in the long term. But one is not describing weather patterns anymore and the generic “climate” descriptions are only applicable to few locations. A nearby location 340 meters higher or lower will have different conditions.

      Swaging data to represent up to 1,200km swaths of land from a minimal number of other sensors misrepresents vast swaths of the world; both in weather and climate.

      • ATheoK,
        Actually, you are describing weather. What I am referring to is using all measurements from all instruments in a region, or over the entire Earth, over a significant period of time, to produce a single average.

        To your point, since the weather in a region is moving, if you average thousands of measurements from that minimal number of instruments in the 1200 km swath, you will have an accurate representation of the average in that region.

      • Slipstick June 19, 2017 at 10:31 pm

        You seem to be thinking of one factor of weather over a homogeneous region, such as precipitation from thunderstorms passing over the state of Oklahoma. Each storm would only drop rain along a narrow strip, but over time all parts of the state might get passed over by an approximately equal number of storms. In this situation, measurements of rainfall made at different stations might over time have similar totals. In this case, what would be gained by averaging? (5+5)/2 = 5. No information is gained.

        ATheoK is thinking of a place such as the state of Oregon, which is divided by a large mountain range. The typical weather pattern is that moisture carried east from the Pacific by the prevailing winds is precipitated primarily on the west slopes of the Cascades. Land to the east of the Cascades typically receives 1/6 the rainfall of land to the west of the Cascades. In this case, what would be gained by averaging? (60+10)/2 = 35. This is like averaging apples and oranges, or calculating the average sex of a local. Only misinformation results.

        SR
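The two cases in the comment above can be checked with a few lines. This is only a sketch using the commenter's own illustrative figures (5 and 5; 60 and 10, units unspecified); the region labels and the `summarize` helper are made up for the example. Besides the mean, it reports the spread and how far the mean is from the nearest actual reading.

```python
from statistics import mean, pstdev

def summarize(readings):
    """Return (mean, population standard deviation, gap from the mean to the closest reading)."""
    m = mean(readings)
    spread = pstdev(readings)
    closest = min(abs(r - m) for r in readings)
    return m, spread, closest

oklahoma_like = [5, 5]    # homogeneous region: stations agree
oregon_like = [60, 10]    # region split by a mountain range

flat = summarize(oklahoma_like)
split = summarize(oregon_like)

print(flat)   # mean equals every reading; spread is zero
print(split)  # mean of 35 is 25 units away from either side of the range
```

In the homogeneous case the average repeats what each station already says. In the divided case the average describes neither side, which only becomes visible once the spread is reported next to it.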

      • Specious claims slipstick!

        You assume gross aggregation of numbers can be divided by sources and the result is an “average”.

        Meaning that you did not seriously read Kip’s article above and are inventing a strawman to distract from Kip’s points.

        Nor are temperature averages true averages.
        They are akin to saying something like:
        • My child’s average height is.
        • The average mountain is.
        • The moon’s average distance is.
        • The average cow is XX tall.
        • The average alarmist’s education is.
        • The average whale has xx calves…

        What is being claimed as individual temperature measurements is only valid for one very small location, whereas nearby conditions can be quite different:
        • A temperature six miles away, in an urban area is higher.
        • A temperature 330 meters higher or lower is a different temperature.

        And completely ignores:
        • Winter temperatures are much different than summer temperatures.
        • Night temperatures, morning temperatures, evening temperatures.
        • Temperatures when the wind changes direction.

        Just what does “average” temperature convey?
        • That someone can make mud out of dirt?
        • That someone can add and divide? Anything?

        This is before taking into account ongoing absurd adjustments without full metadata!
        • Well before understanding that temperature sensors are not identical; in most cases they are not similar!
        • Well before understanding that temperature sensors are not tested and certified in the field,
        • temperature sensors are not run in parallel before replacing,
        • sensor stations are not equivalent in any sense of the word,
        • sensor stations are improperly positioned and local maintenance ignores station placement,
        • sensor stations are frequently infested with wildlife,
        etc etc etc.

        The litany of abuses summed into temperatures makes accurate temperatures farcical. Summing and dividing bad numbers does not eliminate nor correct for egregious behavior, assumptions or false claims.

        You may call such abuses “averages”, that does not make said numerical abuses valid nor representative of temperature.

  12. Thanks Kip. This is another well written and accessible article on averages. Science will eventually prevail but not until the data fiddlers and propagandists have had their way with it.

    To those who are pessimistic I am reminded of the story about the man walking along the water’s edge throwing stranded star fish back into the water. Somebody said to him that what he was doing was futile because there were so many star fish needing to be rescued so he could not possibly make a difference. He said nothing at first but picked up a star fish and threw it into the water. He then responded that he had made a difference to that one.

    That is how science will eventually prevail and the climate wars will be won. Everybody should do what they can.

  13. In 1995 we moved from Andover, MA to Fort Wayne, IN — working at virtually the same salary as before. It was amazing how much further the same salary went at the midwestern cost-of-living we now enjoyed. The rents were much lower for exactly the same type of neighborhood, the auto insurance was cut in half, and on and on. I really think your income comparison should be weighted or at least analyzed in terms of cost of living when talking about salaries in different parts of the country.

    • Douglas ==> Of course, it is not my income comparison, it is that of the US Census Bureau and the Federal Reserve. Local Cost of Living is certainly a factor in weighing income and family financial well-being.
      I don’t know of a group that puts out such a weighted metric, though it would be interesting to see something like County Median Incomes in “Dollars Adjusted for Local Cost Of Living.”
      I’ll repeat my disclaimer here: “I am not an economist, nor a national policy buff, nor interested in US Two-Party-Politics squabbles. Please keep your comments to me to the question of the uses of averages rather than the details of the topics used as examples.”

      • Economists do that sort of adjustment for countries. You’ve likely encountered it. It’s called Purchasing Power Parity. My opinion: it’s not great, but it’s probably better than nothing. It could certainly be applied to states/provinces or counties, but it isn’t often done that I’ve seen.
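As a rough illustration of the kind of adjustment discussed in this sub-thread, the sketch below divides a nominal income by a local price index. It is not the official Purchasing Power Parity methodology, and both the function name and the index values (1.0 for the reference area, 1.5 for an area assumed to be 50% more expensive) are hypothetical.

```python
def cost_of_living_adjusted(nominal_income, local_price_index):
    """Express a nominal income in reference-area dollars.

    local_price_index is the local price level relative to the reference
    area (reference area = 1.0). Hypothetical helper, for illustration only.
    """
    if local_price_index <= 0:
        raise ValueError("price index must be positive")
    return nominal_income / local_price_index

salary = 50_000  # same nominal salary in both places (made-up figure)
low_cost_area = cost_of_living_adjusted(salary, 1.0)
high_cost_area = cost_of_living_adjusted(salary, 1.5)  # assumed 50% higher prices

print(low_cost_area, round(high_cost_area, 2))
```

A single index per area is itself an average over housing, food, insurance and so on, so the adjusted figure inherits the same limits the thread is discussing.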

  14. Kip: I think you are providing a valuable service with this series. I fully agree that the abundance of software available to run sophisticated data analysis such as ANalysis Of VAriance (ANOVA) has resulted in a focus on the numbers – p-values in particular – without bothering to look at the actual data or understanding what it tells us.

    I was a chief engineer in an independent testing lab for many years and trained engineers and technicians in measurement methods and data analysis. I taught that the first thing to do with a data set is plot it in any way you can think of that is meaningful: bar chart, histogram, x,y scatter, line graph, etc. A good visualization of the data often tells much more than you can see just looking at numbers. I also taught that reporting an average without including information on the dispersion of individual values (e.g. standard deviation, range, coefficient of variation, etc.) is statistical malpractice. So is not including a properly determined statement of uncertainty.
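A minimal sketch of that reporting habit, using Python's standard `statistics` module: the `describe` helper and the two sample data sets are invented for illustration. It reports the mean together with the sample standard deviation, range and coefficient of variation. It does not attempt a measurement-uncertainty statement, which depends on the instruments and method used.

```python
from statistics import mean, stdev

def describe(values):
    """Summarize a data set with its mean AND its dispersion."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "n": len(values),
        "mean": m,
        "stdev": s,
        "range": (min(values), max(values)),
        "cv": s / m,  # coefficient of variation; only meaningful when mean != 0
    }

# Two hypothetical data sets with the same average but very different scatter.
tight = [9, 10, 11]
loose = [1, 10, 19]

print(describe(tight))
print(describe(loose))
```

Both sets average 10, so the mean alone cannot tell them apart; the standard deviation, range and coefficient of variation do.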

    Averages are the ultimate data reduction tool. They often reduce an interesting set of data to a relatively meaningless single number.

    Now, as to finding trends in long term time series data consisting of averages of data collected by varying methods and differing sampling schemes – that’s a much bigger can of worms. I suspect my personal hero, W. Edwards Deming, would look at what has been going on in climate science and weep.

    • Rick C PE ==> “…my personal hero, W. Edwards Deming, would look at what has been going on in climate science and weep.” as we all must. Until there is a field-wide review by outside experts, CliSci will remain a mess of enforced-consensus opinions subject to all the known biases: publication bias, noble-cause bias and in some, politically-inspired character assassination (ref: Roger Pielke, Jr, Willie Soon, etc).

  15. As noted several times here, averages often have little value in describing actual conditions. My favorite example: the average human being has one ovary and one testicle.

    Always consider averages skeptically when they are claimed to describe the real world.

  16. Climate averages of temperature are too non linear to be meaningful considering that it takes exponentially more accumulated ‘forcing’ to maintain ever increasing temperatures owing to ever increasing emissions and the wide range of surface temperatures found across the planet. Surface emissions derived from temperature using the SB Law exhibit the property of superposition allowing meaningful averages of emissions when calculated across a whole number of periods, which in a climatic sense must be a whole number of years. Converting the average surface emissions into an average surface temperature is a meaningful metric in the same way that the planet average post albedo incident solar power of 240 W/m^2 (Pi) and corresponding average emissions of 240 W/m^2 (Po) has an EQUIVALENT average temperature of about 255K.
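The 255K figure and the difference between averaging temperatures and averaging emissions can be checked with the Stefan-Boltzmann Law. The sketch below assumes an ideal black body (emissivity 1); the two sample temperatures are hypothetical and chosen only to show the effect.

```python
from statistics import mean

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def emission(temp_k):
    """Black-body emission (W/m^2) at temperature temp_k (K), emissivity assumed 1."""
    return SIGMA * temp_k ** 4

def equivalent_temperature(power):
    """Temperature (K) of a black body emitting `power` W/m^2."""
    return (power / SIGMA) ** 0.25

# Equivalent temperature for 240 W/m^2, as quoted in the comment.
t_equiv = equivalent_temperature(240.0)

# Two hypothetical surface patches, one cold and one hot.
temps = [230.0, 310.0]
mean_of_temps = mean(temps)
temp_of_mean_emission = equivalent_temperature(mean(emission(t) for t in temps))

print(round(t_equiv, 1))
print(mean_of_temps, round(temp_of_mean_emission, 1))
```

Because emission goes as the fourth power of temperature, the temperature equivalent to the average emission is higher than the plain average of the temperatures whenever the patches differ. That is the sense in which emissions, not temperatures, are the quantity that averages linearly.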

    Superposition arises from the simple, linear, COE conforming differential equation that relates the instantaneous incident power, Pi(t) and the instantaneous outgoing power, Po(t) with the change in solar energy stored by the planet, dE(t)/dt, such that, Pi(t) = Po(t) + dE(t)/dt. E(t) is the solar energy stored by the planet and dE(t)/dt is its rate of change, which when positive, the planet warms and when negative, the planet cools. The surface temperature is linearly related to E (whose units are Joules), but not dE/dt (whose units are power), which mathematically is what the IPCC calls forcing and whose steady state value is defined to be zero. The property of superposition arises because COE tells us that any Joule is interchangeable with any other Joule and that if 1 Joule can do X amount of work, 2 Joules can do twice the work. This should be obvious since the units of work are Joules and it takes work to warm the surface.

    This DE gets more interesting when we define an amount of time tau, where all of E would be emitted at the rate Po, so that Po = E/tau, and we can rewrite the equation as Pi = E/tau + dE/dt, which any EE student will instantly recognize as the form of the Linear, Time Invariant differential equation that describes the non linear charging and discharging of an RC circuit, where tau is the time constant.
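For readers who want the RC analogy spelled out, here is a short worked form. It starts from the balance Pi(t) = Po(t) + dE(t)/dt given earlier in this comment and the definition Po = E/tau, and it assumes, purely for illustration, that tau is constant and that Pi is held constant after t = 0 with initial stored energy E_0.

```latex
\begin{aligned}
P_i(t) &= P_o(t) + \frac{dE}{dt}, \qquad P_o(t) = \frac{E(t)}{\tau}
  \;\;\Longrightarrow\;\; P_i(t) = \frac{E(t)}{\tau} + \frac{dE}{dt} \\[4pt]
\text{constant } P_i,\; E(0) = E_0:\quad
E(t) &= P_i\,\tau + \left(E_0 - P_i\,\tau\right) e^{-t/\tau} \\[4pt]
P_o(t) &= \frac{E(t)}{\tau} = P_i + \left(P_o(0) - P_i\right) e^{-t/\tau}
\end{aligned}
```

Substituting the second line back into the first confirms it: E/tau contributes P_i plus the decaying term, and dE/dt contributes the same decaying term with the opposite sign. In the steady state dE/dt is zero and P_o equals P_i, as the comment states.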

    If Ps(t) is the SB emissions of a surface at a temperature of Ts(t) and we know that Ts(t) is linearly proportional to E(t) and we can easily quantify the average NET surface to space transmittance, which includes surface energy absorbed and re-emitted by the clouds and GHGs, as an effective emissivity of an emitting surface at Ts(t) whose emissions are Po(t), we can mathematically ascertain the sensitivity. When you simulate the equations including the relative relationships between Po(t), Ps(t), Ts(t) and E(t), the average sensitivity, that is the average change in Ts(t) per change in dE(t)/dt turns out to be less than the lower limit claimed by the IPCC. The quantifiable effect doubling CO2 concentrations has is on the surface to space transmittance which decreases by about 1.5% when CO2 doubles and is what the IPCC claims is EQUIVALENT to 3.7 W/m^2 of dE/dt while keeping the NET surface to space transmittance constant. That is, instantaneously doubling CO2 instantaneously decreases Po by 3.7 W/m^2 which is indistinguishable from increasing Pi by the same amount.

    The point of this is that for a climate system whose behavior can be expressed in terms of Joules and COE, averages and their relative changes are meaningful metrics and the fundamental reason is the linear behavior that arises when you require any one Joule to be interchangeable with any other one.

    • While I mostly agree with what you say, the problem is that the argument assumes all the energy absorbed at the surface is transported to the TOA by radiation. This is simply not true. In a gas, energy is also transported by conduction and convection. This is important because the CO2-energy-trapping mechanism is radiative in nature. So to the extent that some of the energy at the surface is transported via other mechanisms, there is less available for CO2 to “trap” in the lower atmosphere. Also, the results of the models are always reported in terms of temperature change, but converting from energy to temperature is not trivial as it depends on a number of factors, probably dominated by water vapor content. How do they handle that? Seems like a good place to “tune” the output.

      As far as averages go, they can be useful and meaningful to some degree, but keep in mind that we are talking about a data reduction method. You may gain insight into one particular metric, but at the expense of all the others. This is what is not generally understood and the source of many misconceptions/deceptions.

      • Paul,
        ” …assumes all the energy absorbed at the surface is transported to the TOA by radiation”

        This is not the case. It assumes that energy transported into the atmosphere by non radiative means plus its return to the surface, that is energy transported by matter, has a zero sum influence on the NET photons leaving the planet, thus the NET incident energy required to sustain the surface temperature, since whatever effect the non radiative transport in and out of the atmosphere has, it’s already accounted for by the resulting temperature of the surface and its consequential photon emissions of BB radiation. Trenberth muddies the waters by lumping in the return of non radiant energy entering the atmosphere as ‘back radiation’, but it is clearer if you subtract out convection/thermals and latent heat from the back radiation term.

        Converting energy to temperature (actually power) can be done trivially with the Stefan-Boltzmann Law. The complications arise from the atmosphere, which fundamentally turns a black body surface emitter into a gray body planet emitter, but the emitting surface itself, that is, the virtual surface in direct equilibrium with the Sun, can be considered an ideal BB radiator and often is when processing satellite data, where the temperature of this radiator is an approximation for the actual temperature. Measurements show it to be within a few percent; moreover, it tracks deltas to an even higher accuracy.

  17. Excellent article Kip.
    And some well stated clarification specifics; e.g. commiebob and Steve Case.

    Chippewa bloodline here.

    Consider a normal person or any citizen.
    As a person they are unique, identifiable and bring specific knowledge, talent and skills.

    Averaged, as human face experts study, the person is no longer identifiable. Nor are the concepts of knowledge, talent or skills viable when people are averaged.

    There is a point, which social science people love, that when enough people are aggregated into a large population; that is when sociologists can “predict” behaviors, prejudices and biases.

    Yet that is still not an average. Amassing sufficient persons into a large population sums up their individual knowledge, skillsets and general mental conditions; that is not the same as an average.

    Take the group, divide by population and achieve per capita average that very few individuals resemble, even faintly.

    Averages are vague fuzzy entities. Sports statistics utilize averages extensively; but fantasy sports players will be the first to recognize that statistics fail to reveal the current situation. Exact knowledge of how when and why a person achieved their sports statistics is essential.

    Alleged averages concocted from disparate unique and separate sensor sources are not averages!
    Attempts to average unique separate sources require:
    • a) Identical sensors. Same batch, same certification, same verification procedures post installation!
    • b) Evenly spaced grid.
    • c) All altitudes, latitudes and longitudes are equally represented
    • d) Replacement sensors are run parallel sufficiently long for proper verification.
    • e) Infestations or other contamination are reasons to reject affected data! Never adjust!

  18. Here’s another thought for your household income maps and poverty. The U.S. Census Bureau calculates poverty based on a 50+ year-old study that takes the cost of a subsistence diet and multiplies it by three with no allowance for regional cost of living differences. Thus, California, with a cost of living about 50% more expensive than Texas, has the same poverty level as the Lone Star State. While Alaska and Hawaii are given more generous poverty level thresholds, the contiguous 48 states have the same poverty level.

    Census addressed this issue in their groundbreaking Supplemental Poverty Measure research which also accounted for the value of non-cash government assistance in determining income (in the traditional measure, food stamps–now called Supplemental Nutrition Assistance Program–and Section 8 housing vouchers are not considered income). The Supplemental Poverty Measure also accounts for taxes, out of pocket medical expenses and work-related expenses. These factors cause the poverty rates to soar in the costly Pacific coastal region and in the Northeast and Florida while reducing poverty rates in the low cost Midwest and South.

    When all is said and done, California has consistently seen the highest Supplemental Poverty rates in the nation for the past several years.

    • Chuck ==> Good example of how commonly quoted averages often serve to hide the very information we most need to know.

  19. The distribution of wealth by county in Wyoming tells an interesting story if one knows quite a lot else about the state. The statement that wealthy persons have retired to Wyoming explains the high income in Teton County only. It is very, very wealthy and votes overwhelmingly Democratic. Most of western Wyoming has a large Mormon population and employment in the energy industry. Campbell County in the north central part of the state is energy employment: oil at one time but now mainly coal. With so much wealth and income connected to energy it is not difficult to understand why the state, as a whole, voted so heavily against Hillary.

    • K.kilty ==> Yes, the local ground-truths, like that of the mystery spot in North Dakota in the essay, are often the key to understanding — key factors glossed over by the application of averages.

  20. Well presented. Now for the challenge of applying it to temperature data. I always find it amusing when I read that the average global temperature is 15 degrees C. Is there a spot on earth that is perfect because it is average?
    February is the hottest month in Singapore with an average temperature of 27°C and the coldest is January at 26°C.
    July is the hottest month in Los Angeles with an average temperature of 22°C and the coldest is January at 13°C, so a yearly average of about 17.5°C.
    January is the hottest month in Melbourne (my home) with an average temperature of 21°C and the coldest is July at 10°C, so yearly average is about 15.5°C, almost matching the global annual average.
    July is the hottest month in New York with an average temperature of 25°C and the coldest is January at 2°C, so a yearly average about 13.5°C.
    The average temperature in London is 19°C in July and 5°C in January, so a yearly average of about 12°C.
    July is the hottest month in Moscow with an average temperature of 19°C and the coldest is January at -8°C, so yearly average about 5.5°C. Why do people live there? No wonder the Russians might be in favor of some warming.
    With all of these variations, we are then told that the average global temperature has risen by about 1°C in the last 100 years, and a further 1°C rise will be catastrophic. For whom? Where?

    • “Is there a spot on earth that is perfect because it is average?” Why should average be perfect? In almost every case average is not perfect.

      Your examination of populations is apt. People tend to be located in areas that are currently productive. If there were free migration of people we could probably adapt reasonably well to changes. Siberia gets more productive, so people move there from areas that have become less productive. However, given our current set up of nation states with very little will to allow immigrants, this adaptation seems unlikely to happen.

  21. I have a more fundamental objection to the averaging of temperature data. SB Law is that P=5.67*10^-8*T^4 with P in watts per square meter and T in degrees Kelvin. Doing the math, it takes 3.3 w/m2 to raise the temperature from -30 to -29 deg C. But it takes 7.0 w/m2 to raise the temperature from +30 to +31 degrees C.

    Calculating a global temperature average to try and track changes in earth’s energy balance due to increasing CO2 is patently absurd. All that much more absurd when one considers that the people who do these calcs have enormous compute power available to them. Taking all that temperature data, raising it to the power of 4, and THEN averaging it might produce something useful, and would be trivial for them to do.
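
    The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes an ideal black body (emissivity 1), uses the constant quoted above, and converts Celsius to Kelvin by adding 273.15; the function names are made up for illustration. Run as written, it gives about 3.3 W/m2 for the cold step and roughly 6.3 W/m2 for the warm step, a little under the 7.0 quoted, though the roughly two-to-one asymmetry is the same.

```python
# Stefan-Boltzmann constant in W m^-2 K^-4 (the value quoted in the comment).
SIGMA = 5.67e-8


def sb_flux(temp_c):
    """Black-body flux in W/m^2 for a temperature given in degrees Celsius."""
    temp_k = temp_c + 273.15  # assumed Celsius-to-Kelvin offset
    return SIGMA * temp_k ** 4


def flux_step(temp_c):
    """Extra flux needed to warm an ideal black body by 1 degree from temp_c."""
    return sb_flux(temp_c + 1.0) - sb_flux(temp_c)


cold_step = flux_step(-30.0)
warm_step = flux_step(30.0)

print(f"-30 -> -29 C: {cold_step:.2f} W/m^2")
print(f"+30 -> +31 C: {warm_step:.2f} W/m^2")
```

    The same one-degree change costs about twice as much power at the warm end, which is the point being made about averaging temperatures rather than T^4.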

    • For the umpteenth time, they don’t average temperatures. They average anomalies. And they don’t use it to try to track changes in earth’s energy balance. For that you do have to track the temperature of the region radiating. Most of that is not at surface anyway.

      • Nick writes

        They average anomalies. And they don’t use it to try to track changes in earth’s energy balance.

        Not true. They (try to) measure the temperatures of the ocean and from that derive the accumulated energy and hence energy balance.

      • How do they get the normals to derive the anomalies that they then average? Seems a circular argument to me.

      • Nick Stokes ==> I am so happy to see you say “And they don’t use it to try to track changes in earth’s energy balance.”. I say “They can’t use it to….”
        Of course, the whole reason they track any of the global temperature averages or anomalies is exactly to do what you say they don’t do with it — maybe there are some people just interested in numbers….but the consensus position is precisely that rising global temperature trends prove that CO2 concentration increases are causing the Earth Climate to gain energy, retain more energy, thus raising temperatures and adding energy to the climate system (more storms, extreme weather, etc).
        Perhaps you can write an essay for Judith’s blog setting these guys straight —

      • Nick Stokes June 20, 2017 at 12:38 am
        For the umpteenth time, they don’t average temperatures. They average anomalies.

        And for the umpteenth time, there is no difference. An anomaly of 1 deg from -30 is 3.3 w/m2 and an anomaly of 1 degree from +30 is 7.0 w/m2. Averaging anomalies is the exact same error as averaging temperatures and for the EXACT same reason.

        And they don’t use it to try to track changes in earth’s energy balance. For that you do have to track the temperature of the region radiating. Most of that is not at surface anyway.

        That is EXACTLY what averaged temperatures and anomalies are used for. The press is full of quotes from various bodies screaming about the need to keep earth’s temperature rise below the “dangerous” threshold of 2 degrees over pre-industrial.

      • Nick ==> For heaven’s sake — Mosher says they don’t average any temperature but predict temperature that one would find at unmeasured locations and then just pretend that they have produced an average — you claim that “they” don’t average temperatures at all — they average temperature anomalies — if I asked you “anomalies of what?” you would have to truthfully answer “Anomalies of averaged temperatures.” There is no anomaly without an average to be an anomaly from!

      • Nick Stokes, “… they don’t average temperatures. They average anomalies. ”
        1. Are you saying a particular station’s monthly average is
        the total of each day’s Tmax for the month minus the total of each day’s Tmin for the month, divided by 2, and the base period is calculated the same way; and the recorded anomaly is the current calculated monthly average minus the base period’s calculated monthly average?
        2. Are the grids, whether they be 5 X 5, 3 X 3 or 5 X 3, weighted to correct for varying surface areas at differing latitudes when they are averaged?

      • NS,
        In order to compute an anomaly for a particular station, one has to compute an historical average first. Thus, temperatures ARE being averaged, contrary to your claim! If that historical average (baseline) has a standard deviation of tens of degrees, what is the precision of resulting anomalies?

      • Kip,
        “you would have to truthfully answer “Anomalies of averaged temperatures.””
        Absolutely not. That is the fundamental difference that almost everyone misses. You must form the anomalies first and then average. You get a quite different, and wrong, answer if you average temperatures in a region over a period and then take the anomaly. That is the homogeneity thing. You don’t have the same stations in each month.

        Yes, people aren’t always careful about saying that it is an anomaly average. But well-informed people know what they mean.

      • Davidm,
        “That is EXACTLY what averaged temperatures and anomalies are used for. The press is full of quotes from various bodies screaming about the need to keep earth’s temperature rise … “
        They aren’t talking about radiative balance. They are just talking about the consequences of getting hot.

        OLR isn’t simply determined by surface temperature. Here is a map. There is obviously correlation, but it isn’t tight. Australia is not really hotter than Brazil (year-round).

        On anomalies, yes, you could weight them by T^3 to get the T^4 effect. It would just mean something different. Even locally – the annual average would be much more like average summer temperature.

      • Nick ==> Unless you have some way to reverse the arrow of cause and the arrow of time, I do not see how it is possible to… “You must form the anomalies first and then average.” Between you and Mosher, I am beginning to feel I’ve woken up on Bizzaro World.
        According to NOAA NCDC
        “In climate change studies, temperature anomalies are more important than absolute temperature. A temperature anomaly is the difference from an average, or baseline, temperature. The baseline temperature is typically computed by averaging 30 or more years of temperature data.”
        (link = https://www.ncdc.noaa.gov/monitoring-references/dyk/anomalies-vs-temperature )
        This is as I thought: to find an anomaly to average, one must first find anomalies by finding the difference between a temperature and its 30-year average as a baseline.
        Can you explain your statement “”You must form the anomalies first and then average.”” in light of this very clear definition?

      • Just a thought with the baseball analogy that came up earlier. However I know nothing about baseball, so please work with me.
        You have 100 bowlers, you want to see if something is affecting their performance.
        You take their average so far, then calculate an anomaly from their last game – that is the difference between their score in the last game and their average in the season up to then.
        If we just averaged the scores before and after the result would be dominated by the high scorers. But if we take the anomalies then average we get a more meaningful result. We are then looking at the change rather than the absolute amount.

        I am not sure about this but would welcome a comment from Nick to explain if these situations are remotely analogous, and it might help me and others to understand.

      • Nick Stokes
        They aren’t talking about radiative balance.

        First of all Nick, you haven’t responded to my point about averaging anomalies being just as insane as averaging temperatures. It is the exact same problem. You cannot toss it off with a few words about getting “something else”. It doesn’t matter if you get something else or not, what matters is that averaging temperatures or anomalies is ridiculous.

        As for not representing earth’s energy balance, sorry, but the profile of emission and absorption from TOA to surface IS part of the energy balance. How it changes affects temperatures where we live. Complaining that energy balance means exclusively energy exchange between earth system as a whole and outer space is a ridiculous interpretation of what I said. Substitute whatever terminology you want, at day’s end the mantra is look, the temps/anomalies are rising, it is because of CO2. The metric is meaningless.

      • Paul Jackson,
        “recorded anomaly is the current calculated monthly average minus the base period’s calculated monthly average”
        Yes. That is how the local anomaly is calculated, prior to spatial averaging.

        And yes, in grid averaging, the cells are area-weighted (basically, cos latitude).
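
        The two steps just described (local anomaly first, then an area-weighted spatial average) can be sketched as follows. The three grid cells and their temperatures are invented for illustration, and the weight is simply the cosine of the cell's central latitude; real products handle cell geometry and missing data with more care.

```python
import math

# Hypothetical grid cells: (central latitude in degrees,
#                           this month's mean in deg C,
#                           base-period mean for the same month in deg C)
cells = [
    (2.5, 27.4, 27.0),
    (42.5, 11.1, 10.5),
    (72.5, -13.0, -15.0),
]


def area_weighted_anomaly(grid_cells):
    """Form each cell's anomaly, then average with cos(latitude) weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for lat_deg, monthly_mean, base_mean in grid_cells:
        anomaly = monthly_mean - base_mean          # local anomaly first
        weight = math.cos(math.radians(lat_deg))    # cell area shrinks toward the poles
        weighted_sum += weight * anomaly
        weight_total += weight
    return weighted_sum / weight_total


simple_mean = sum(m - b for _, m, b in cells) / len(cells)
weighted_mean = area_weighted_anomaly(cells)

print(f"unweighted mean anomaly:    {simple_mean:.2f}")
print(f"area-weighted mean anomaly: {weighted_mean:.2f}")
```

        With these made-up numbers the high-latitude cell has the largest anomaly but the smallest area, so the weighted figure comes out lower than the plain mean.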

      • seaice
        “would welcome a comment from Nick to explain if these situations are remotely analogous”
        I know even less about baseball, but yes, I think so. Anomalies are useful for focussing on change. But the basic reason for using them is that when you calculate an average over time, the population of sites changes. Comparing the averages of different sets is OK providing they are homogeneous – ie equally (approx) representative of sites on earth. For absolute temperatures, that is not true. So you’d have to worry every month whether you had the right balance of hot and cold places (and you don’t have much control). But anomalies are calculated to be homogeneous. That is, you subtract out the expected value. It’s still possible that
        1. you didn’t get that quite right, or
        2. There are other differences, eg variance (heteroskedastic)
        But you have made a huge improvement.

        An example of the marginal issue is the Arctic. Subtracting the base period means takes out the issue that it is habitually cold. But it doesn’t take out the fact that it is warming faster than elsewhere. That is why Hadcrut, which has had weak Arctic data (improving), showed lagging recent trends, which increased with proper accounting, as Cowtan and Way showed.

      • Kip,
        “Can you explain your statement “”You must form the anomalies first and then average.”” in light of this very clear definition?”
        Yes. The NOAA definition would have been clearer if they had specified site averages, but the following text makes it clear that that is what they mean. Calculate the site anomaly, and then aggregate. Sometimes it is slightly fuzzier, as when they calculate a cell anomaly rather than site, arguing that temperatures within a cell are homogeneous enough to average. But the text says they don’t even do that. Here is the diagram they use to illustrate why anomalies are used:

        Here is how they explain it:

        Using anomalies also helps minimize problems when stations are added, removed, or missing from the monitoring network. The above diagram shows absolute temperatures (lines) for five neighboring stations, with the 2008 anomalies as symbols. Notice how all of the anomalies fit into a tiny range when compared to the absolute temperatures. Even if one station were removed from the record, the average anomaly would not change significantly, but the overall average temperature could change significantly depending on which station dropped out of the record. For example, if the coolest station (Mt. Mitchell) were removed from the record, the average absolute temperature would become significantly warmer. However, because its anomaly is similar to the neighboring stations, the average anomaly would change much less.

        Of course, there is averaging within a site. Sub-day readings are aggregated to daily average. Those are averaged to a month (again, a bit of an assumption that within the month they are homogeneous, but accurate work will correct for this if there are missing values). That is part of the process of forming anomaly. But that has to be done before you start aggregating stations.
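
        A toy version of the station drop-out argument, as a sketch only: the five stations, their baselines and their current-year values below are invented, with one cool mountain site standing in for the coolest station in the NOAA illustration.

```python
# Hypothetical stations: name -> (30-year baseline, current-year mean), deg C.
stations = {
    "A": (15.0, 15.4),
    "B": (14.0, 14.5),
    "C": (13.0, 13.3),
    "D": (12.5, 13.0),
    "mountain": (6.0, 6.4),  # the coolest site
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def mean_absolute(data):
    """Average of the current-year temperatures themselves."""
    return mean(current for _, current in data.values())


def mean_anomaly(data):
    """Average of (current - own baseline), formed station by station."""
    return mean(current - base for base, current in data.values())


without_mountain = {k: v for k, v in stations.items() if k != "mountain"}

shift_absolute = mean_absolute(without_mountain) - mean_absolute(stations)
shift_anomaly = mean_anomaly(without_mountain) - mean_anomaly(stations)

print(f"change in mean absolute temperature: {shift_absolute:+.2f} C")
print(f"change in mean anomaly:              {shift_anomaly:+.3f} C")
```

        Dropping the cool site moves the absolute average by more than a degree while the anomaly average barely changes, which is the behaviour the quoted NOAA text describes.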

      • Nick ==> Well, at last. FIRST, there is the averaging, then the averaging of averages, and maybe one more averaging of averages…..and THEN there is subtracting the long term average of averages (whether it be a site average, less bad, or a cell average, more bad), which is called the 30-year baseline — to find the anomaly for that site/cell….

        So there is a tremendous amount of averaging and averaging averages before one gets anywhere near “spatial averaging”.

        Glad we got that cleared up (?).

      • Nick, I am not sure it is possible to know less about baseball than I do, but I basically understand cricket, so I will accept the possibility. I fairly recently learnt the word “heteroskedastic” so I am always pleased when it comes up.

        It seems to me that we would be dealing not just with our 100 baseball players, but a changing population of players. Some drop out and new ones come in. Using anomalies we can deal with this in a sensible way. Simple averages would make no sense. We would not be certain that we were not at any particular point over-represented by good bowlers or bad bowlers. Does this make sense, or is this a mis-representation of the issue?

      • Nick ==> Ah, yes, so in this specific example, we are already using annual averages for each site…..and finding the anomalies of each site from its 30-year average of annual averages.
        I’ll write my thoughts about the sins of averaging averages next time.

      • “For the umpteenth time, they don’t average temperatures. They average anomalies. ”
        Using the school class example, for the whole school, you would use anomalies from the average for the year level in a base period to avoid the variation in number of students in the different year levels. This still does nothing about a change in average due to changes in demographics and doesn’t make the changing average useful in estimating class height of the next generation.

      • Robert ==> True that. The pretense that a changing metric over time means something in particular is rampant in today’s science circles. Epidemiology has been ravaged by the insistence that this is true — that a changing “average” over time — either the whole average or an anomaly — MUST mean something. This idea is simply not true. A changing metric (which always makes a trend) is just a changing metric until the system producing the metric is understood so that the change can be evaluated.
        I had fried chicken for lunch.

      • I used an example a while back of a newspaper report on research that showed Australian women were getting fatter on average. About 2 kg. A break down in the change in population of groups based on weight and age showed that only the over 110 kg in the 45-60 group had a significant increase in number, doubling from the previous survey 12 years before.
        The interpretation using the average was that women in general were piling on the pounds. The more detail version showed huge improvements in saving morbidly obese women from death.

      • davidmhoffer wrote: “And for the umpteenth time, there is no difference. An anomaly of 1 deg from -30 is 3.3 w/m2 and an anomaly of 1 degree from +30 is 7.0 w/m2. Averaging anomalies is the exact same error as averaging temperatures and for the EXACT same reason”.

        This isn’t much of a problem, David. Call T the mean global temperature and dT the difference between a local temperature and the mean global temperature. In your extreme example, T = 273 K and dT is +/- 30 K. If we use T to calculate outgoing radiation W:

        W = σT^4

        If 50% of the Earth were at T+dT and 50% were at T-dT:

        W’ = 0.5*σ(T+dT)^4 + 0.5*σ(T-dT)^4
        W’ = σT^4 + 6*σT^2*dT^2 + σdT^4

        The proper average W’ is slightly bigger. Calculate the ratio of these numbers:

        W’/W = 1 + 6*(dT/T)^2 + (dT/T)^4

        T is obviously much bigger than dT, so we can ignore the last term. In your example dT/T is about 1/9 and 6*(dT/T)^2 is 2/27, a 7% error.

        In the real world, T is 288 K and maximum dT is about 13 K over oceans (70% of the surface). Except for Antarctica and some extreme locations during winter, maximum dT over land is usually 20 K. The average dT is less than these extremes. Think about averaging the error term 6*(dT/T)^2 over the whole planet over a whole year. Maybe the average dT difference from 288 K is 10 K. I get an error term of about 0.7%.

        So the problem of using (average T)^4 in place of the correct average(T^4) is fairly small.
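
        For anyone who wants to check the numbers, here is a small sketch of the same half-warm, half-cold planet. It assumes the idealised split described above and an emissivity of 1; the function name is invented for illustration.

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def emission_ratio(mean_t, dt):
    """Ratio of average(sigma*T^4) to sigma*(average T)^4 when half the
    surface sits at mean_t + dt and the other half at mean_t - dt (kelvin)."""
    true_mean = 0.5 * SIGMA * (mean_t + dt) ** 4 + 0.5 * SIGMA * (mean_t - dt) ** 4
    naive = SIGMA * mean_t ** 4
    return true_mean / naive


for mean_t, dt in [(273.0, 30.0), (288.0, 10.0)]:
    ratio = emission_ratio(mean_t, dt)
    x = dt / mean_t
    closed_form = 1 + 6 * x ** 2 + x ** 4  # the expansion, with sigma cancelled
    print(f"T={mean_t:.0f} K, dT={dt:.0f} K: ratio={ratio:.4f}, "
          f"closed form={closed_form:.4f}, error={100 * (ratio - 1):.2f}%")
```

        The direct calculation and the closed-form expansion agree, giving an excess of a little over 7% for the extreme case and about 0.7% for the 288 K, 10 K case.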

        In reality, most photons reaching space are emitted from the atmosphere high above the surface. Temperature variations with latitude are smaller there, but variation with altitude is bigger. You need the radiation module from a GCM to do these calculations properly. And we have satellite measurements to check these calculations. For the purposes of WUWT, the error in using mean global temperature to discuss mean global emission of LWR is almost always unimportant.

    • Dmh,
      That is a very good point, one I have also used.
      When you think about it, you wonder if such slack can be, and is, accounted for adequately in GCMs.
      Geoff

  22. You are so right Kip on statistical averages. One can look at employment, durable goods, GDP and on and on. The tales can be told from many perspectives dependent upon positions taken on such. Ultimately, the real numbers will rule in the end.

    How would sea ice averages be viewed if we took a starting point of 1974 as an example?

    Just sayin….

  23. Kip,
    I have one small quibble with an otherwise excellent article. You said, “…believed by their creators and promoters to actually produce a single-number answer, an average, ACCURATE to hundredths or thousandths of a degree or fractional millimeters.” I believe that “precise” would be a better choice of words than “accurate.” The issue of the accuracy of the average is another question entirely, more related to the sampling and interpolation protocols.

    • Clyde Spencer ==> I believe the claims are both to accuracy, with unsupportable tiny error bars, and to precision, to physically-impossible precision — example: Global Mean Sea Level change precise to tenths of a millimeter.

      • “… to physically-impossible precision — example: Global Mean Sea Level change precise to tenths of a millimeter”

        IIRC, there’s a convention problem with things like this. Let’s suppose that sea level rise at a given set of tidal gauges or satellite is accurate to 0.6mm. Ignoring the fact that one is measuring Apparent Sea Level change whereas the other reports Eustatic Sea Level change, pragmatically one can either report to the nearest mm and lose accuracy or report to the nearest 0.1mm and imply more accuracy than exists. Rigorously one would have to specify error limits (one sigma? 3 sigma? something else?). That’s messy and you’d just end up defending your error estimates.

        I think there’s more to the analysis than that, but I don’t recall the details. Anyway, the convention is to report enough precision not to lose accuracy. Seems to me probably the least bad resolution?

      • Don ==> I am thinking more specifically of the satellite derived MSL (mean sea level). If one checks the expected error bars on the satellites, each newer version, the latest one was expected to be able to read sea level to within +/- 2 or 3 mm. I have these figures somewhere, gathered in preparation for an essay on the topic, but that’s about right. So from an instrument that takes multiple readings at multiple locations at multiple times….with an error bar no tighter than 2 or 3 mm — they derive, somehow, a Global Mean to tenths of a mm?

      • Kip: “the latest one was expected to be able to read sea level to within +/- 2 or 3 mm” A couple of cm, not mm, I believe. There’s apparently that much uncertainty in all satellite orbits — I think due to the fact that drag is neither constant nor entirely predictable. (Don’t push me [too] hard on in-track vs transverse error — I don’t know the answers.) Add to that other uncertainties — e.g. if you’re trying to measure radar returns to mm accuracy, you probably have to worry about every detail of the signal path, like where the antenna is with respect to the satellite center of mass. Then there’s variable ionospheric delay to worry about. And waves on the ocean surface. And air pressure. And … You get the idea.

        OTOH, they are able to make a lot of measurements — 20 per second if memory serves. They are using the same instrumentation for all measurements. And, conceptually at least, satellite passes repeat over “exactly” the same spot every few hundred passes (254 if the internet isn’t lying to me), which means you can get millions of was-is_now pairings at the same “spot” at 10, 20, 30 … day intervals that may or may not be free of instrument drift (or drifting at known rates, which is probably just as good if you can compute corrections).

        You’re proposing to analyze the errors in the values reported from that setup?

        Seems challenging.

      • Don K,
        There is an old philosophical observation that one can never step into the same stream twice. Similarly, a satellite can never pass over the same spot twice because of waves and tides.

      • “There is an old philosophical observation that one can never step into the same stream twice” Clyde, ignoring the poetry/philosophy, there’s a good pragmatic reason to worry about being very close to exactly the “same” latitude and longitude when comparing observations. It turns out the sea surface is rather far from being “level”. If I haven’t misplaced a decimal point, the worst case is in the Eastern Indian Ocean where the slope of the sea surface might approach one part in 10000. That is to say that if you’re worried about mm accuracy, you need to be within 10m of the same place.

        Tides? Oh yeah. Tides. Thanks for bringing that up. They need to be corrected for. We know tides with reasonable precision, I think. Surely within a few cm? But we’re worried about mm or microns (0.1mm = 100 microns, right?). 100 microns = 0.01 cm.

      • Kip. It occurs to me that I’ve failed to make the point I’m concerned about explicitly. What I’m trying to say is that the physical systems involved in measuring sea level change at the sub-mm level from satellites are extraordinarily complex even by “climate science” standards. While trying to work out the error budget would be a great intellectual exercise, I’m not sure that it is doable. Certainly I can’t do it. The number of people who can may well be zero.

      • Comparable to those problems are the even greater uncertainties and inaccuracies of trying to “measure” the mass differences caused by assumed ice changes over Antarctica and Greenland. The same satellite pair chasing each other over the globe overhead is assumed to be properly affected by the different masses below: the thousands of meters of not-constant-density ice, and the varying rock weights under that, across the unequal heights and depths of the never-measured, invisible rock base miles beneath the ice!

        But what the GRACE satellites presume is that they can determine the CHANGES in ice mass (changes in ice depth) from year to year by assuming the rock depths below are moving vertically the way the scientists assume the rock moves.

      • Kip, Dr. Lubchenco, former NOAA administrator, once assured me that the TOPEX satellite provided “instantaneous” data concerning sea level changes.

        Nevertheless, according to Impact of Altimeter Data Processing on Sea Level Studies, the satellite has suffered instrumentation drift that requires correction using surface tidal stations as reference. Also, in the time it takes to scan an area of the ocean, waters have moved from place to place, leaving the measurements subject to the same tidal variations as the terrestrial measurements.

  24. “The high temperature today was 104. That is 23 degrees warmer than the normal of 81.” You can hear that on TV any day. First of all there is no such thing as a Normal temperature. Who can say what is normal? Nobody. The number they are calling normal is in fact an average of the high temperature on that date for the 30 year period ending at the beginning of the current decade as calculated and published by The National Weather Service. It is a short term average which is somewhat meaningless from a climatic point of view. Such is life.

  25. While we’re on this subject, can anyone identify what useful information is conveyed by averaging the results of different computer models? Let’s say I know almost nothing about hockey (a completely true assumption) but that I’m given $100 to bet on an upcoming hockey game. Knowing nothing about hockey, I look online and see that “X” sports analyst is predicting team 1 to win by 2 goals, “Y” sports analyst is predicting team 1 to win by 1 goal, and “Z” sports analyst is predicting Team 2 to win by 1 goal.

    The average of the three predictions is team 1 by 2/3 of a goal, and I see that the spread is 1 goal, so I bet on team 2 to beat the spread. But, as far as I can deduce, the information conveyed by the average relates to the predictions of the game – not the actual results of the game, which is a unique event not amenable to a mathematically probabilistic analysis. What I’m really measuring is the expected value of a next sequential prediction by another analyst. I’m not measuring anything related to the actual event, i.e. the singular game to be played in the future.

    That I resort to taking such an average highlights neither my expertise in hockey (never that) nor my mathematical aptitude, since I’ve come up with completely useless information. What it illustrates is my ignorance; because I lack the ability to look at the teams and their past performance and make my own judgment, I have to resort to a simple average of predictions of people I presume to be experts.

    So given all that, what am I to conclude about the “expertise” of the IPCC when, faced with widely differing scenarios presented by different models developed by different groups of people with different assumptions about the inner workings of the climate, the IPCC can’t just pick the best and instead just averages the results?

      • I had read that a while back, and I think we’re talking about different things. If I have a single model of a nonlinear dynamic system, and its output produces chaotic results that vary significantly from one set of initial conditions to another, it might be argued that the average results of the runs of that particular model represent some likely outcome on the assumption that all possible initial conditions are equally probable (although I’m instinctively skeptical of even that).

        But once you start dealing with different models, with different assumptions used in each model (or tuned to different observations) then I can’t see any analytical reason to average the results of those disparate models together, beyond “Hey, look at the pretty graph that we’ve simplified for all you silly rubes.”

      • Kurt ==> “…although I’m instinctively skeptical of even that” in that, you would be very justified. Averaging chaotic results is a fool’s errand — it does not produce anything approaching a valid prediction. Any “ensemble” of model runs only reveals the boundary conditions (assuming enough runs) for that model but not for the physical system being modeled.
        Given that, averaging the results of different model ensembles no more produces valid predictions than averaging the results from 100 gypsy fortune tellers.

    • Kurt

      I’m smart enough to think I understand your analysis, but not smart enough to know if it is correct. However, it does seem to me that it would apply equally well to the stock market, to most economic analyses, and to a wide variety of other things. Are you trying to destroy civilization as we know it?

      • I’m not that grandiose. I have a hard enough time trying to destroy the weeds that keep infesting my lawn, so I just try to smite the little stuff.

        Not sure what you mean by the “stock market” or “economic analyses.” The DJIA is an average share price of a specified basket of stocks, presumably indicative of the market as a whole, so the question is whether the selection of what’s in the basket is a sufficiently representative sample of everything, like polling people and trying to get a demographically representative sample.

        My issues relate to the narrow question of what you are sampling. If you assume that the DJIA is indeed representative of the entire universe of stocks in the NYSE, then you should be able to take another representative sample of different stocks and come up with the same average number. If you assume that your polling methodology is accurate, a new poll should produce similar results even if the people change. The average relates to the common feature of your samples.

        If I sample results from a single model, I can at least get my mind around the idea that the results show the expected behavior of the theoretical climate system that the model represents, which can be compared to reality as a kind of benchmark for whether my model accurately simulates the climate’s behavior. But if I start averaging together the output of different models, each model representing different theories of how the climate system works, that average doesn’t represent any useful metric. I’m sampling from among different, mutually exclusive theories about how something might work.

        Let’s assume for example that the “average” behavior of the models happens to line up very well with the way the climate actually behaves in the years subsequent to the model runs, but none of the individual model averages do. If individually, none of the models got it right, then I can’t have confidence in the set of assumptions made by any individual one of the models. If I can’t have confidence in the ability of any individual model to accurately simulate the climate, what is the point of sampling them to begin with?

        The polling example may provide the best analogy. Real Clear Politics averages multiple polls, each with different sampling and weighting methodologies, but the polls often have widely disparate results that can’t all be true. Averaging them together shouldn’t give you any better information about the opinions of the electorate at any given point in time. Possibly, if all of them are moving in the same direction over time, you can infer that, regardless of which poll most accurately samples the electorate as a whole, one candidate is gaining steam and the other is not. But the average tells you nothing, and if one poll contradicts the trend of all the others, how do you know that that one poll isn’t the one that is doing the correct sampling, without exercising independent expertise as to why it should be treated as an outlier?

      • Kurt — I was thinking of stock market analyses, not the DJIA/S&P/Nikkei per se.

        But you are aware that the DJIA is a moving target? Companies are added and subtracted. I think that the only remaining member of the original 1896 DJIA is GE. Most (all?) of the rest are not only no longer in the index; they are mostly defunct, although some still exist after a fashion, e.g. US Rubber ended up as a small part of Michelin.

      • “But you are aware that the DJIA is a moving target? Companies are added and subtracted.”

        I’m not sure that’s necessarily a problem. Two different opinion polls taken in consecutive weeks will sample different people, but that doesn’t mean that each poll is not representative of the entire population at the time it was taken.

    • Kurt,
      It seems to me that with an ensemble, one can logically only have one ‘best’ result (barring duplicates). Thus, averaging the best result with the inferior results will give one something in between the best and worst. Is that useful? Probably not as useful as the best result. By not determining which are the good and poor results, no insight is gained on what contributes to the quality or utility of the different models.

      • But that’s true with calling a coin toss heads or tails, as well. Only one result is actually going to happen, but that doesn’t mean that the “average” of the group of all possible outcomes isn’t useful.

        For simplicity, assume that health insurance only insures against lung cancer. Insurers know that a smoker will either get lung cancer, or not. They assess the probability of a particular insured getting cancer in the policy term (usually pretty low, even if you are a smoker) and multiply it by the expected cost if cancer is contracted, to arrive at their “expected” cost. They add a “premium” on top of that for their profit, and bill the insured. The “expected cost” in this example will never happen. If a person gets cancer, the costs to the insurer will vastly exceed the “expected cost.” If the person does not get cancer, the costs are zero.

        If the insurer makes this calculation for a very large number of people, accurately assessing the risks of each person and charging them the appropriate amount for their policy, then the insurer can probably pre-calculate how much profit it will make for the whole group, but for each insured, the proper, “expected” result does not correspond with the actual outcome.
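A rough Python sketch of that insurance arithmetic follows. The claim probability and claim cost are made-up numbers chosen only for illustration, and the pool is simulated with a seeded random generator; none of these figures come from the comment.

```python
import random

P_CANCER = 0.002        # hypothetical probability of a claim during the policy term
CLAIM_COST = 250_000.0  # hypothetical cost to the insurer if cancer is contracted

# Probability-weighted "expected" cost per insured.
expected_cost = P_CANCER * CLAIM_COST


def actual_cost(rng):
    """Cost for one insured: either the full claim or nothing."""
    return CLAIM_COST if rng.random() < P_CANCER else 0.0


rng = random.Random(42)  # fixed seed so the sketch is repeatable
costs = [actual_cost(rng) for _ in range(100_000)]

# No individual policy ever costs the "expected" amount...
anyone_at_expected = expected_cost in costs

# ...but the average over a large pool lands close to it.
pool_average = sum(costs) / len(costs)

print(expected_cost, anyone_at_expected, round(pool_average, 2))
```

Every entry in `costs` is either zero or the full claim, yet the pool average sits near the per-person expected cost, which is the distinction the comment draws between the group and each insured.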

        Now Kip, above, says that averaging chaotic results retains no useful information, and my gut instinct agrees with that, though I think it’s more a question of how you assign a respective probability to each of the sets of initial conditions. I think the modelers treat them all as equally probable, but that has to be an assumption made of convenience (i.e., it’s all they can do) rather than being a reasoned assumption.

      • Kurt ==> Must remember not to confuse random results with truly chaotic results (chaotic in the sense of Chaos Theory).
        A fair coin toss has a nearly perfect 50-50 probability ratio – not because of averaging, but because of probability. The results of any actual series of coin tosses have a semi-chaotic component, as the imperceptible initial condition effects, air currents, landing surface imperfections, etc. may skew results chaotically — this effect will be very small but is there nonetheless.

  26. Not that it matters, but “An interest of mine, as my father and his 10 brothers and sisters where born on the Pine Ridge in southwestern South Dakota, the red oval.” “where” surely should be “were”.

    • Don ==> Absolutely correct — I’ll fix it — danged auto-correct spellchecker! (and lazy-eyed copy editor).

  27. Note on the Normal Curve/Gaussian distribution. Statisticians love the Normal Distribution because of its nifty mathematical properties. But AFAICS hardly anything in the real world other than (usually) the Standard Error of the Mean actually distributes normally. If I Recall Correctly, the poster child for the normal distribution — the Intelligence Quotient curve — actually required a substantial adjustment at one point to make the numbers coming out of testing more closely match the theoretical distribution.

  28. Here, so far this June, we have had 12 days colder than normal and 7 days warmer, with the month’s average temperature so far 0.2 °C warmer than normal.

  29. Two general comments:

    1. I think it’s OK and often useful to add apples to oranges. You just need to remember that the units of the result will be “fruit”. Not hard in that case. But sometimes the distinctions are a lot more subtle.

    2. Not all useful numbers have a sound physical basis. Example — Sunspot numbers. If I understand them correctly, they are a rather complicated index, not a count. That’s why we never see sunspot numbers less than 10. Nonetheless, they do seem to correlate to solar activity and are said to be useful. What about statistical operations on sunspot numbers? Are those operations well behaved? Meaningful?

    • Sunspot numbers are an asymptotic proxy for solar activity, i.e., the sunspot number has a lower limit of zero while other measures of solar activity are decidedly not zero. Using stars other than our own, Astrostatistical Analysis in Solar and Stellar Physics concludes, in part, “We find that incorporating multiple proxies reveals important features of the solar cycle that are missed when the model is fit using only the sunspot numbers.”

  30. It all gives me brain-ache, yet still it’s a ‘brain-worm’ I cannot shake off.
    There’s something pretty epic in it though – an ‘instinct’ or ‘intuition’ tells me so, and it’s how people interact with these ‘averages’.
    What’s got me going is the flood discussion.
    Let’s say A River, somewhere, anywhere, has an average flow rate or (more easily measured by people) an average depth.
    That average could be decades or centuries long.
    Let’s look at the River Mild, and it’s been 5 feet deep under the bridge at Boringville for the last 250 years. Fine.
    But because we’re doing averages, that means, by definition, unusual things happen.
    So, after some cold wet weather on River Mild’s catchment, the sun comes out, temperatures soar and thunderstorms break out. Perfectly feasible as the T storms are fed by all the recent wet weather.
    And suddenly, on a Tuesday and 3 hours after midnight, the River Mild rises by 20 feet and its flow rate goes up 30-fold. Because of the T storms.
    Because lots of folks thought it was really a River Mild, they built houses next to it. They used ‘The Average’.

    But for 6 or 7 hours it turned into River Wild.
    It cut a vast swathe of muddy devastation through Boringville, destroyed homes, gardens, fields and a great many of the people themselves.
    Then for the next 250+ years it returned to being River Mild.
    That single one-off event damaged a great number of people, but when Ivory Tower Dwellers, at public expense, come to work out The Average, that (flash) flood completely vanishes.
    So the people rebuild their houses and the High Street at Boringville and then what happens?

    10 years later another freak flood arrives. Similar thing to playing The National Lottery.
    (Not unlike Carlisle, on the River Eden. In Cumbria.) (The flood, not the lottery)

    So where are your averages then? What use were they to you in either Carlisle or Boringville? Was there *really* any point to working them out? Are they not just even more faerie counting?

    Did they lull you into a false sense of security or, as is equally valid for averages, a false sense of doom and disaster? As per climate science right now.
    How do you/me/anyone connect those numbers with The Real World and Real People?

    Maybe we could start by opening the doors and windows of a few Ivory Towers and kick the residents out.

    I think I’ve worked it out here and now. There are too many Moshers and the like.

    Over to you Donald……..

    • Nassim Nicholas Taleb writes entertaining and possibly insightful books about extreme events and our inability to think clearly about them — The Black Swan, Fooled by Randomness. Possibly worth reading if you haven’t encountered them.

    • Peta,
      Your story illustrates why skewness and variance need to be provided along with a mean. Also, providing mode and median is informative. Focusing on mean alone is either being deceptive purposely, or illustrating statistical ignorance.

  31. Kip,
    I really don’t see the point of this loquacious dumbing down of already elementary statistics. Especially if inaccurate – the mean is an average, but the mode and median are not. They are all measures of central tendency.

    I really can’t see the point of your second example. Of course the average, or any other summary statistic, like median, can’t tell you all about the dataset. That is why the various authorities present all that other information that you quote – all subset averages. I think the Dow is a useful average to think about. It is very widely quoted and used. It won’t tell you whether copper stocks are booming, or taxis have collapsed. Everyone knows that it won’t, but they still find it useful.

    Your first example does again illustrate the use of averaging as an estimate of population mean, though you don’t seem to see it that way. It probably isn’t an average being sought, but a bottom quintile, or some other summary statistic, but the principle is the same. What you want is an estimate of the average of the population of boys for whom the bar is intended. The data for Mrs Larsen’s class is the sample you have available, at least initially. And you do need that summary statistic. The boys may vary, but the bar can only have one height.

    Sampling is a science, and you need to get it right. As you say, it might happen that her class is not representative. That is not a problem with average calculation; it is a problem of sampling. And there are remedies. Maybe Mrs Larsen’s class had an unusually large number of Hispanics, and maybe they tend to be shorter (I don’t know that that is true). So you re-weight according to what you know about the population. It is that knowledge that you need to design the bar, plus someone who actually knows about statistics.

    • Nick writes: “Of course the average, or any other summary statistic, like median, can’t tell you all about the dataset.”

      Or indeed what a changing average actually means.

      • “Or indeed what a changing average actually means.”
        Yes, you have to figure it out. Like when the Dow drops 2000 pts. It’s not obvious why, but plenty of people are going to be curious. It matters to them.

      • Sorry, we only model to the market level. We don’t go down to the stock level, and when we try, it’s not a great result. But I’m sure knowing what the market in general is doing will give you enough information to help you choose your portfolio.

    • Nick Stokes ==> I’m sorry, but in my first part I posted images of the common everyday definitions of averages, like this one:

      You may argue with it all you like, and decide that “statistician’s special language” trumps everyday English and that what you learned in Stats classes nullifies what everyone else learns in their K-12 mathematics class.
      I am not a statistician and I do not write for statisticians but for the readers here.
      I write essays here with the intention of helping readers gain a better understanding of the topics they see in their everyday lives and exposing some of the abuses of Science that appear in their news outlets.
      Read the other readers’ comments here — try to understand that not everyone has a university degree that included deep statistical theory and practice — in fact, as you know all too well, very few people anywhere understand statistics beyond high school levels — including most scientists.
      So — elementary as it may be, essays like this help people become smarter — to read smarter — to listen smarter.

  32. A well written essay with many good points, but, unless you are arguing semantics, the sentence “The Earth is not a climatic area or climate region, the Earth has climate regions but is not one itself.” is false. All of the “climate regions” of the Earth are connected and continuously exchanging energy and are, on average, one climate system.

    I’m not trying to answer this question but just trying to clarify. My English is not so good so I beg your pardon in advance.
    If we think of “Local Climate” as a class, could we say that “Global Climate”, being a SuperClass of LC, is an LC Class itself? Let’s define the members of the LC Class: “Temperature, Air pressure, etcetera”. Can we find such members in the GC Class?
    If not, then GC is not a Climate Class but a name for the average of Climate Class instances!

  33. In other words: an engine is a system whose parts are well connected and exchange energy, but can we build an engine of engines?

    • Mariano ==> I like the engine analogy — not perfect, but I like it. All the parts of an Engine are Engine Parts — but the Engine itself, though made of the same materials, is not itself an Engine Part.

      • Kip ==> thanks for your comment. I’m not saying that Climate is like an engine (I’m not qualified to say that); what I’m saying is that we must know whether Climate can be seen as a motor or as a society. A society is not only the sum of its individuals but has its own laws that modify the society itself.
        I can’t see such a distinction in the Climate debate. As I said, I can’t tackle such a complex field with my bad English, but I would like to read someone saying something about closed or open systems, for example.

      • Kip,
        Analogies don’t have to be perfect to be useful. Indeed, if they were “perfect” they would be equivalent to the original statement and would lose the utility of viewing the problem from a slightly different viewpoint. The optimal analogy is similar enough to the original statement that no one can claim that it is unrelated or a non sequitur, but different enough that it can break down prejudices or biases that are interfering with someone seeing the essence of the problem.

  34. Hmm. Nice one

    In fact this misuse of statistics is part of a wider, almost philosophical problem that lies behind nearly all of the problems the not so modern mind seems to have in dealing with the complexity of life as it really is, rather than in the idealised and simplified pictures that are all we seem capable of.

    Science itself is just such an idealised and simplified picture. And there will always be a compromise between ‘idealised and simplified to the point of error’ and ‘so complex we can’t compute it anyway’.

    Unfortunately my message, that in climate science these two areas overlap massively, is unwelcome by alarmists and skeptics alike.

    Everyone wants to seek out the One True Cause of climate change.

    The message that they never will, is not desired.

  35. Kip

    An excellent post and a good reminder of the limitations of using averages for analysis.

    I do have a quibble but it may add to your points. You have used Household and Family incomes interchangeably. The Census Bureau has a very precise definition of both terms. There are about 125 million Households and about 80 million Families in their reports. When using either term the results will give you different data, much like mean versus median and nominal versus real. Even the word “income” can mean different things, market versus aggregate.

    To reinforce your point, when looking at the data for income over a 50 or 60 year period, there are changes in demographics etc that can alter the meaning of any comparisons over decades. For instance, the growth in real income of a family with two earners has been greater than those of single households. Why? Because there is a positive correlation between marriage rates and education and thus income. Also, there are many more Households with a single individual today than 50 years ago. The proportion of Single Mom families has grown substantially and the disparity in income between that unit and the two income earner families has grown with it. The real story sometimes is down in the weeds and each piece has to be dissected to understand what is really going on.

    • cerescokid ==> I raised this issue some time ago in my What Are They Really Counting?
      Especially when they are offering averages of huge amounts of data (public surveys are the same problem — what exactly did they ask?).
      All in all, the Median Income example serves to show that averaging disparate information often hides what we really need to know.
      In our “sound bite era” one has to dig dig dig to get at the real meat of a story.

  36. If you really want to have fun, try having a discussion with an ‘average’ person about percentiles. I once tried (and failed) to convince a young Physicians Assistant that I could not be in the 100th percentile.

      • “Really, isn’t the 100th percentile the value equal to the maximum value in a distribution?” Phil.

        If we were talking about intelligence, he would have to be smarter than himself to be in the 100th percentile.

  37. Good observations, but I wish you could have been as detailed in illustrating the problem with “climate” data as you were able to illustrate with income data.

    One simple example I sometimes use to illustrate the uselessness of “average temperature” is to talk about weather patterns in my region (central Arkansas). Weather sources will routinely report that the “average” temperature for a particular day is such and such a number. But in reality I would expect it to be bi-modal, especially in the warmer months. We are in a region where the weather is determined by fronts “passing through” the state. On either side of the front the temperature will vary significantly. So one day the high is, say, 90, and the next day it is 80. “Statistics” will say that the “average” temperature “for this time of year” is 85, when in fact it is almost always higher or lower. “Mode” is a better measure of central tendency, and would show the pattern to be bi-modal.
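A small Python sketch of that point, using invented daily highs that alternate between 90 and 80 (the ten values below are hypothetical, not real Arkansas records):

```python
from statistics import mean, multimode

# Hypothetical daily highs (deg F) for the same calendar date over ten years:
# about half the time a front has passed through, half the time it has not.
highs = [90, 80, 90, 80, 80, 90, 90, 80, 90, 80]

average_high = mean(highs)   # the single "average" a weather report would quote
modes = multimode(highs)     # every value tied for most frequent

# How many of the observed days actually had the "average" high?
days_at_average = highs.count(average_high)

print(average_high, sorted(modes), days_at_average)  # prints: 85 [80, 90] 0
```

The mean of 85 is a value that never occurs in the data, while `multimode` reports both peaks.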

    Someone upthread gave the example of Oregon’s “average rainfall.” Again, meaningless, since the regional climate is so different on either side of the Cascades. Climate is at best a regional concept, and even then simple averages can be misleading because even within a well defined climate region averages will vary over time because weather patterns (and long term changes in weather patterns, aka “climate”) are dynamic.

    I’m no expert on climate “models” but do they even try to reflect the dynamics of weather/climate change or are they essentially static “models” for discrete periods of time?

    • blcjr ==> If I use too many climate examples (even any, really) the result is often just a bunch of knee-jerk reactions from the Climate Warriors all spouting Mandatory Climate [Consensus or Skeptic] Talking Points.
      Many people get distracted from the main point — in this case how averages can hide and obscure information — and focus only on defending their favored position on climate science.

      • We have had the recent example of a drought in California coupled with floods in the south of England. Average global rainfall may not have changed a whit, but the climate effects on human comfort were rather severe.

  38. One of the more exotic types was the geometric mean. Here’s an example I picked up reading articles by Isaac Asimov in “Fantasy and Science Fiction” magazine. He was comparing the size of humans to blue whales and mice. Are we bigger or smaller than the average mammal?
    Here are their average masses:
    10^5 kg blue whale
    70 kg human
    3*10^-2 kg mouse

    An arithmetic mean would give just about 1/3 of 10^5 kg; most mammals are WAY below average in mean mass. A more reasonable comparison is the geometric mean, which would be (10^5 kg × 70 kg × 3×10^-2 kg)^(1/3), which gives about 59.44 kg. Humans are bigger compared to mice than blue whales are compared to humans.
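For anyone who wants to check those figures, here is a short Python sketch using the three masses quoted in the comment. The dictionary and variable names are only illustrative.

```python
from statistics import geometric_mean, mean

# Approximate body masses in kilograms, as given in the comment.
masses_kg = {"blue whale": 1e5, "human": 70.0, "mouse": 3e-2}
values = list(masses_kg.values())

arithmetic = mean(values)           # dominated by the whale
geometric = geometric_mean(values)  # cube root of the product of the three

# Size ratios between neighbours on the list.
human_vs_mouse = masses_kg["human"] / masses_kg["mouse"]
whale_vs_human = masses_kg["blue whale"] / masses_kg["human"]

print(round(arithmetic), round(geometric, 2))
print(round(human_vs_mouse), round(whale_vs_human))
```

The arithmetic mean comes out near 33,357 kg and the geometric mean near 59.44 kg, and the human-to-mouse ratio is the larger of the two ratios, which matches the comment's conclusion.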

    • Alan McIntire ==> As a teen, I had a collection of early SciFi mags, over 500 of them, all read and re-read.

  39. When looking hard at renewable energy it occurred to me that in programming a computer, in calculating the cost and benefit of an aircraft design and indeed when looking at putting a windmill on a grid, the income is derived from the average performance of the engineered solution, but the cost is dominated by worst case provisions.

    Perhaps that is a subject worthy of an essay.

    (anyway its impact on renewable energy is massive: the worst case of renewable energy is its generating nothing and the cost of covering for that exceeds all the value in the renewable solution).

    • Leo ==> I’d love to read it. For wind, it seems there are a lot of breakdowns that are very expensive to repair, so expensive that in many cases, the windmill is simply left out of service — not to mention the well-publicized catastrophic failures. We see almost nothing about solar failures, though they must happen — wiring close to the ground or on roof-tops, different repair scenarios. Does anyone know when one out of 20 solar panels fails on their roof? Lots of interesting questions.

      • IIRC (and it is a somewhat vague R!) domestic panels are wired in series so any one failure is seen as complete. Hence why it is so important that no panel is in shadow e.g. from a chimney or a TV antenna.
        Whether larger installations are in series/parallel groups or actively connected so that they appear such I don’t know.

  40. Paul Krugman walks into the bar, Cheers …

    Per capita bar patron income goes up. Income inequality among bar patrons gets worse. Median net worth is higher. The total amount of taxes paid by bar patrons increases. The likelihood a randomly selected patron is also a contributor to the New York Times opinion columns rises astronomically. The chance that a randomly selected patron voted Republican in the last presidential election decreases, slightly. The average “carbon footprint” among bar patrons rises measurably. The amount of hot air expended in conversations among Cliff, Norm, Frasier, and now Paul contributes to global warming by a comical amount …

  41. One of the things that I think is important in any discussion of averages is what is called the “Flaw of Averages.” Boiled down simply, it is that while averages may be used to describe the population the data describes, all too often no single individual data-point will ever fit that average profile. Particularly if more than a single descriptor is used.
    See this article: https://www.thestar.com/news/insight/2016/01/16/when-us-air-force-discovered-the-flaw-of-averages.html
    So, the question is simply this: the average can be determined, but does it really say anything at all if it is used in an effort to apply that average to what is actually occurring?

  42. Chaotic dynamical systems tend to produce distributions that are the “opposite” of normal distributions, that is, they are heavily weighted at the tails. Averages are least applicable to such systems.

  43. The other day I was listening to some blather about the Big Bang being preceded by (even though there is no time yet ???) a singularity that was, among other properties, infinitely hot. But temperature is a measure of average kinetic energy. How can a singularity have an average in the form of temperature? Worse, kinetic energy implies velocity, and velocity implies space over time. I thought those had not “unfurled” yet.

  44. Allow me to inject a curious, yet distantly related fact, into a most enlightening discussion:

    Jan Kareš from the Czech Republic did 232 pull-ups on the 19th of June in 2010, establishing what is believed to be the world record for this exercise. My max pull ups ever was 15, some multiple decades ago.

    Has anyone determined whether Thursdays in Maharishi Vedic City, Iowa have gotten warmer over the past ten or so years? Then we have to ask, “Warmer how?” … right? — “warmer” at what particular time of day?, … are we talking rainy days?, … cloudy days?, where exactly — under a tree – WHICH tree?, ten feet above the ground?, ten and a HALF feet?, … who measured it? … using what sort of instrument?, … was this person patient enough to use the measuring instrument competently? … was he or she sober? … etc.

    … seems to be somewhat elusive.

  45. Good article, as far as it goes. However, there are two economic factors that also need to be considered:

    1. Number of people per household has been shrinking. So, income per person is growing faster than income per household.

    2. The quintiles are not made up of the same households, from year to year. Families frequently move to different quintiles over time. I personally have been in all five income quintiles at various points in time.

    • David ==> You realize that the essay is not about incomes?
      Nonetheless, that Median FAMILY income at the lowest levels does not improve over such a long time period, in real dollars, means the poor are getting poorer, even if many, like us, escape to higher economic status.

  46. 3 inches is about 5% of the height of the boys. Maybe a better analogy would have been taking a trend of 1/10th of an inch per year and using it to justify a model that says it should be built 10 inches higher for future generations (even though the change in the past 20 years has been less than predicted).

  47. EPILOGUE:
    Great discussion, thanks to all who chimed in, even those that I did not answer directly.
    The Deep Numbers folks added a bit of confusion to the mix, some of which got sorted out.
    Was happy to see some good examples from other fields and perspectives.
    Thank you all for reading here.
    Next time, Part 3…
    As always, if you still have questions you want answered, you can email me at my first name at the domain i4 decimal net.

  48. A couple of points:

    1. For most 6th graders, a head is not a foot, but with only 9 inches of clearance, I expect some of Mrs Jones’ boys to hit their heads on the ceiling.

    2. Your Median Income Reality chart still has a problem in that, even if the lowest quintile had 10% gains while the top had ‘only’ 5% gains, the scaling still would make it look like the top was coming out (unfairly) ahead. I suggest a log scale, which would show gains (or losses) in percent (apples to apples) terms.

    • Bob ==> eGads! You are, of course, absolutely right — I failed to subtract the top-of-head to shoulder bit before adding in the arm length! Luckily — no one else noticed.
      Good maths eye, congratulations.

  49. I will try one more time to introduce the expected value or best estimate concept as a good tool to analyze uncertainties in climate science. A prediction of a future condition, such as temperature, is an underdetermined problem for many reasons including incomplete databases and flawed GCMs. This kind of problem is ideal for application of the triangular probability function to calculate best estimates (Figure 1). This is not rocket science, but it might provide boundaries for common sense solutions.

    For the triangular distribution above, (A + B + C)/3 is the probability weighted average of the function, that is, the expected value or best estimate for the event represented by the distribution of x. B is the mode of the probability function, the most frequently occurring value. For a normal distribution, mode = mean = best estimate = expected value.

    A single value prediction is not an expected outcome from any analysis. A prediction will be a range of values with a high probability of including the actual value. Assume that the extreme values A and C above can be estimated. For example, A might be a climate alarmists’ temperature estimate and C might be a temperature estimate heavily influenced by solar activity. The value of the mode B is unknown but must lie between A and C.

    To calculate the best estimate of a highest temperature, the values A and B are equal, which results in a positive-skewed probability distribution function from which the highest possible best estimate can be calculated. To calculate the best estimate of a lowest temperature, the values C and B are equal, which results in a negative-skewed probability distribution function from which the lowest best estimate can be calculated. These calculations determine a range of values in which the best estimate of a predicted temperature should lie. The range could be shortened with larger databases and better methodology.
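
    The bounding calculation can be sketched as follows, assuming only the triangular mean formula (A + B + C)/3 given above. The function names and the two extreme values are hypothetical, chosen for illustration. With the mode unknown, placing it at each extreme in turn gives the two ends of the range in which the best estimate must lie.

```python
def triangular_mean(a, b, c):
    """Expected value of a triangular distribution with extremes a, c and mode b."""
    return (a + b + c) / 3

def best_estimate_range(lower, upper):
    """Range of possible best estimates when the mode is unknown.

    The mode must lie between the extremes, so the mean is smallest when
    the mode sits at the lower extreme and largest when it sits at the upper.
    """
    lowest = triangular_mean(lower, lower, upper)    # mode at lower extreme
    highest = triangular_mean(lower, upper, upper)   # mode at upper extreme
    return lowest, highest

# Hypothetical extreme estimates, in arbitrary temperature units.
lowest, highest = best_estimate_range(1.0, 4.0)
print(lowest, highest)
```

    The width of that range is always one third of the distance between the two extremes, so narrowing the extremes narrows the range of best estimates in proportion.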

    The takeaway: Cost-benefit analyses for determining appropriate environmental policies must include the consequences of both a warming earth and a cooling earth. Until the possible range of estimates narrows sufficiently to result in a favorable cost-benefit from the policies, the correct decision is to do nothing. That could also be the final answer.

    (The use of triangular distribution functions to predict best estimates is based on notes from W. C. Hauber, Shell Oil Company, circa 1967. For more details on the methodology, refer to https://en.wikipedia.org/wiki/Three-point_estimation.)

    • Tom ==> The Comments section is not really a very good venue for introducing major new concepts, either about Climate science or Statistics.
      I might suggest that you write a major essay and submit it via the Submit Story link on the menu bar at the top of the page.
      You can be sure to get a beating for it — but you’ll know how well the idea flies in the wild.
      Good luck with it.

      • Tom ==> I only mean to encourage you to put your ideas out there where they’ll get the right amount of attention. Buried in comments they won’t get noticed.
        I am not a statistician and this essay wasn’t about statistics but something far more mundane. Don’t be discouraged, go for it if you are confident of your position.

  50. Yours is an excellent general point. I have made the same point, but from a much different perspective. My example is that of one machined part being run on three different screw machines. In my example we are measuring the outside diameter of a bearing shoulder. If I measure 30 samples at random from the total produced by all three machines, my information is only valuable if each machine produces the same part on average with the same amount of variability. That is almost never the case. I could also measure 10 samples from each machine and compute an average of the three averages and an average variability. That is slightly better information but still doesn’t tell me a lot. No, I have to look at the average and range for each machine as an independent system in order to know how the parts are really running. Even if the systems being studied are very similar, the grand average is misleading.
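
    A minimal sketch of the three-machine comparison, using made-up diameters (the machine labels, sample values and units are all hypothetical, and only three samples per machine are shown to keep it short). It computes the average and range for each machine separately and then the grand average of the pooled samples.

```python
from statistics import mean

# Hypothetical outside-diameter measurements for each screw machine.
samples = {
    "A": [10.00, 10.01, 9.99],   # on target, tight spread
    "B": [10.05, 10.06, 10.04],  # running large, tight spread
    "C": [9.90, 10.10, 10.00],   # on target on average, wide spread
}

def summarize(measurements):
    """Return (average, range) for one machine's measurements."""
    return mean(measurements), max(measurements) - min(measurements)

per_machine = {name: summarize(values) for name, values in samples.items()}
pooled = [x for values in samples.values() for x in values]
grand_average = mean(pooled)

for name, (avg, rng) in per_machine.items():
    print(f"machine {name}: average {avg:.3f}, range {rng:.3f}")
print(f"grand average: {grand_average:.3f}")
```

    In this made-up data the grand average sits close to target, yet it hides that machine B runs consistently large and that machine C has ten times the spread of machine A.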
