The Laws of Averages: Part 2, A Beam of Darkness

Guest Essay by Kip Hansen

 

This essay is the second in a series of essays about Averages — their use and misuse.  My interest is in the logical and scientific errors, the informational errors, that can result from what I have playfully coined “The Laws of Averages”.

Averages

Both the word and the concept “average” are subject to a great deal of confusion and misunderstanding among the general public, and both have seen an overwhelming amount of “loose usage” even in scientific circles, not excluding peer-reviewed journal articles and scientific press releases.  For that reason, I gave a refresher on Averages in Part 1 of this series.  If your maths or science background is near the great American average, I suggest you take a quick look at the primer in Part 1 before reading here.

A Beam of Darkness Into the Light

The purpose of presenting different views of any data set — any collection of information or measurements about a thing, a class of things, or a physical phenomenon — is to allow us to see that information from different intellectual and scientific angles — to give us better insight into the subject of our studies, hopefully leading to a better understanding.

Modern statistical [software] packages allow even high school students to perform sophisticated statistical tests on data sets and to manipulate and view the data in myriad ways.  In a broad general sense, the availability of these software packages now allows students and researchers to make [often unfounded] claims for their data by using statistical methods to arrive at numerical results — all without understanding either the methods or the true significance or meaning of the results.  I learned this by judging High School Science Fairs and later reading the claims made in many peer-reviewed journals.  One currently hotly discussed controversy is the prevalence of using “P-values” to prove that trivial results are somehow significant because “that’s what P-values less than 0.05 do”.  At the High School Science Fair, students were including ANOVA test results about their data — none of them could explain what ANOVA was or how it applied to their experiments.

Modern graphics tools allow all sorts of graphical methods of displaying numbers and their relationships.   The US Census Bureau has a whole section of visualizations and visualization tools. An online commercial service, Plotly,  can create a very impressive array of visualizations of your data in seconds.  They have a level of free service that has been more than adequate for almost all of my uses [and a truly incredible collection of possibilities for businesses and professionals at a rate of about a dollar a day].  RAWGraphs has a similar free service.

The complex computer programs used to create metrics like Global Average Land and Sea Temperature or Global Average Sea Level are believed by their creators and promoters to actually produce a single-number answer, an average, accurate to hundredths or thousandths of a degree or to fractional millimeters.  Or, if not actually quantitatively accurate values, at least accurate anomalies or valid trends are claimed.  Opinions vary wildly on the value, validity, accuracy and precision of these global averages.

Averages are just one of a vast array of different ways to look at the values in a data set.  As I have shown in the primer on averages, there are three primary types of averages  — Mean, Median, and Mode — as well as a number of more exotic types.
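For readers who want to check the three primary averages against their own numbers, Python’s standard statistics module computes all of them directly; the data set below is invented purely for illustration:

```python
from statistics import mean, median, mode

# A small made-up data set, invented for illustration only
values = [2, 3, 3, 5, 7, 10, 12]

print(mean(values))    # arithmetic mean: 42 / 7 = 6
print(median(values))  # middle value when sorted: 5
print(mode(values))    # most frequent value: 3
```

Note that the three answers already differ for this tiny set; when the data is strongly skewed, as in the income examples later in this essay, they can differ substantially, and each tells a different story.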

In Part 1 of this series, I explained the pitfalls of averaging heterogeneous, incommensurable objects or data about objects.  Such attempts end up with Fruit Salad, an average of Apples-and-Oranges: illogical or unscientific results, with meanings that are illusory, imaginary, or so narrow as not to be very useful.  Such averages are often imbued by their creators with significance — meaning — that they do not have.

As the purpose of looking at data in different ways — such as looking at a Mean, a Median, or a Mode of the numerical data set — is to lead to a better understanding, it is important to understand what actually happens when numerical results are averaged, and in what ways averages lead to improved understanding and in what ways they lead to reduced understanding.

A Simple Example:

Let’s consider the height of the boys in Mrs. Larsen’s hypothetical 6th Grade class at an all-boys school.  We want to know their heights in order to place a horizontal chin-up bar between two strong upright beams for them to exercise on (or as mild constructive punishment — “Jonny — ten chin-ups, if you please!”).  The boys should be able to reach it easily by jumping up a bit, so that when hanging by their hands their feet don’t touch the floor.

The Nurse’s Office supplies the heights of the boys, which are averaged to get the arithmetical mean of 65 inches.

Using the generally accepted body part ratios we do quick math to approximate the needed bar height in inches:

Height/2.3 = Arm length (shoulder to fingertips)

65/2.3 = 28 (approximate arm length)

65 + 28 = 93 inches = 7.75 feet or 236 cm
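The quick math above can be sketched in a few lines of Python; the 2.3 ratio is the essay’s rule of thumb, an approximation rather than a measured value:

```python
# Bar height from mean height, using the rule-of-thumb ratio
# arm length ~ height / 2.3 (an approximation, not a measurement)
mean_height_in = 65
arm_length_in = round(mean_height_in / 2.3)     # 65 / 2.3 ~ 28
bar_height_in = mean_height_in + arm_length_in  # 65 + 28 = 93

print(bar_height_in)                # 93 inches
print(bar_height_in / 12)           # 7.75 feet
print(round(bar_height_in * 2.54))  # ~236 cm
```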

 Our calculated bar height fits nicely in a classroom with 8.5 foot ceilings, so we are good.   Or are we?  Do we have enough information from our calculation of the Mean Height?

Let’s check by looking at a bar graph of all the heights of all the boys:

[Chart: bar graph of the heights of the boys in Mrs. Larsen’s class]

This visualization, like our calculated average, gives us another way to look at the information — the data on the heights of the boys in the class.  Because the boys range from just five feet tall (60 inches) all the way to almost six feet (71 inches), we will not be able to make one bar height that is ideal for all.  However, we see now that 82% of the boys are within 3 inches either way of the Mean Height, and our calculated bar height will do fine for them.  The 3 shortest boys may need a little step to stand on to reach the bar, and the 5 tallest boys may have to bend their legs a bit to do chin-ups.  So we are good to go.

But when they tried the same approach in Mr. Jones’ class, they had a problem.

There are 66 boys in this class and their Average Height (mean) is also 65 inches, but the heights had a different distribution:

[Chart: bar graph of the heights of the boys in Mr. Jones’ class]

Mr. Jones’ class has a different ethnic mix, which results in an uneven distribution, much less centered around the mean.  Using the same Mean +/- 3 inches (light blue) used in our previous example, we capture only 60% of the boys instead of 82%.  In Mr. Jones’ class, 26 of the 66 boys would not find the horizontal bar set at 93 inches convenient.  For this class, the solution was a variable height bar with two settings: one for the boys 60-65 inches tall (32 boys), one for the boys 66-72 inches tall (34 boys).

For Mr. Jones’ class, the average height, the Mean Height, did not serve to illuminate the information about boys’ height to allow us to have a better understanding.   We needed a closer look at the information to see our way through to the better solution.  The variable height bar works well for Mrs. Larsen’s class as well, with the lower setting good for 25 boys and the higher setting good for 21 boys.
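The same-mean, different-distribution effect is easy to reproduce.  The two height lists below are hypothetical stand-ins, invented for illustration and not the actual class data from the charts; both average exactly 65 inches, yet the fraction of boys served by a single bar height differs sharply:

```python
from statistics import mean

# Hypothetical height lists, invented for illustration -- NOT the
# actual class data.  Both have a mean of exactly 65 inches.
larsen = ([60]*1 + [62]*3 + [63]*5 + [64]*8 + [65]*12
          + [66]*8 + [67]*5 + [68]*3 + [70]*1)   # one central hump
jones  = ([61]*16 + [62]*8 + [63]*4
          + [67]*4 + [68]*8 + [69]*16)           # two clusters

def fraction_within(heights, lo=62, hi=68):
    """Fraction of boys within +/- 3 inches of the 65-inch mean."""
    return sum(lo <= h <= hi for h in heights) / len(heights)

print(mean(larsen) == mean(jones) == 65)   # True: identical means
print(round(fraction_within(larsen), 2))   # 0.96 -- mean serves well
print(round(fraction_within(jones), 2))    # 0.43 -- mean misleads
```

The mean alone cannot distinguish these two classes; only a look at the distribution can.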

Combining the data from both classes gives us this chart:

[Chart: bar graph of the combined heights of the boys in both classes]

This little example is meant to illustrate that while averages, like our Mean Height, serve well in some circumstances, they do not do so in others.

In Mr. Jones’ class, the larger number of shorter boys was obscured, hidden, covered-up, averaged-out by relying on the Mean Height to inform us of the best solutions for the horizontal chin-up bar.

It is worth noting that Mrs. Larsen’s class, shown in the first bar chart above, has a distribution of heights that more closely mirrors what is called a Normal Distribution, a graph of which looks like this:

[Chart: a Normal Distribution curve]

Most of the values cluster in a hump in the middle, falling off more or less evenly in both directions.  Averages are good estimations for data sets that look like this, if one is careful to use a range on either side of the Mean.  Means are not so good for data sets like Mr. Jones’ class, or for the combination of the two classes.  Note that the Arithmetical Mean is exactly the same for all three data sets of boys’ heights — the two classes and the combined — but the distributions are quite different and lead to different conclusions.

US Median Household Income

A very common measure of economic well-being in the United States is the US Census Bureau’s annual US Median Household Income.

First note that it is given as a MEDIAN — which means that there should be an equal number of households above this income level as below it.  Here is the chart that the political party currently in power — regardless of whether it is the Democrats or the Republicans — with both the Oval Office (US President) and both houses of Congress in their pocket, will trot out:

[Chart: the Good News graph — US Median Household Income rising steadily through the years]

That’s the Good News! graph.  Median Family Income on a nice steady rise through the years, and we’re all singing along with the Fab Four: “I’ve got to admit it’s getting better, a little better all the time…”

This next graph is the Not So Good News graph:

[Chart: the Not So Good News graph — Median Household Income in real (inflation-adjusted) dollars, 1985-2015]

The time axis is shortened to 1985 to 2015, but we see that families have not been gaining much, if at all, in Real Dollars, adjusted for inflation, since about 1998.

And then there is the Reality graph:

[Chart: Mean Household Income by income quintile, plus the Top 5%, over 49 years]

Despite the Good News! appeal of the first graph, and the so-so news of the second, if we dig below the surface, looking at more than just the single-number Median Household Income by year, we see a different story — a story obscured by both the Good News and the Not So Good News.  This graph shows MEAN Household Income for the five quintiles of income, plus the Top 5%, so the numbers are a bit different, and it tells a different story.

Breaking the population into five parts (quintiles), the five brightly colored lines, the bottom-earning 60% of families — the green, brown and red lines — have made virtually no improvement in real dollars since 1967.  The next quintile up, the middle/upper-middle classes in purple, has seen a moderate increase.  Only the top 20% of families (blue line) have made solid, steady improvement — and when we break out the Top 5%, the dashed black line, we see that not only do they earn the lion’s share of the dollars, but they have benefited from the lion’s share of the percentage gains.
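The way a single summary number hides a skewed distribution is simple to demonstrate with a median-versus-mean comparison; the incomes below are invented purely for illustration:

```python
from statistics import mean, median

# Hypothetical household incomes in $1000s, invented for
# illustration: one very high earner skews the distribution.
incomes = [20, 25, 30, 35, 40, 45, 50, 60, 80, 300]

print(median(incomes))  # 42.5 -- half the households earn less
print(mean(incomes))    # 68.5 -- dragged upward by the top earner
```

The median barely registers the top earner; the mean is pulled far above what a typical household earns.  Neither single number describes the shape of the distribution.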

Where are the benefits felt?

[Chart: the US national Median Household Income metric]

Above is what the national average, the US Median Household Income metric, tells us.  Looking a bit closer we see:

[Map: Median Household Income by state]

Besides some surprises, like Minnesota and North Dakota, it is what we might suspect.  The NE US — NY, Massachusetts, Connecticut, NJ, Maryland, Virginia, Delaware — all coming in at the highest levels, along with California and Washington.  Utah has always had the more affluent Latter-Day Saints and, along with Wyoming and Colorado, has become a retirement destination for the wealthy.  The states whose abbreviations are circled have state medians very near the national median.

Let’s zoom in:

[Map: Median Household Income by county]

The darker green counties have the highest Median Household Incomes.  It is easy to see San Francisco/Silicon Valley in the west and the Washington-DC-to-NYC-to-Boston megalopolis in the east.

This map answered my big question: How does North Dakota have such a high Median Income?  Answer: It is one area, circled and marked “?”, centered on Williams County, with Williston as the main city.  The area has fewer than 10,000 families.  And “Williston sits atop the Bakken formation, which by the end of 2012 was predicted to be producing more oil than any other site in the United States” — it is the site of America’s latest oil boom.

Where is the big money?  Mostly in the big cities:

[Map: Median Household Income by county, with major cities marked]

And where is it not?  All those light yellow counties  are areas in which many to most of the families live at or below the federal Poverty Line for families of four.

[Map: Median Household Income by county, with US Indian reservations overlaid]

An overlay of US Indian reservations reveals that they are, in the west particularly, in the lowest and second lowest income brackets. (An interest of mine, as my father and his 10 brothers and sisters were born on the Pine Ridge in southwestern South Dakota, the red oval.)   One finds much of the old south in the lowest bracket (light yellow), and the deserts of New Mexico and West Texas and the hills of West Virginia and Kentucky.

One more graphic:

[Chart: distribution of US household income]

What does this tell us?

It tells us that looking at the National Median Household Income as a single number — especially in dollars unadjusted for inflation — presents a picture that obscures, hides, whitewashes over the inequalities and disparities that are the important facts of this metric.  The single National Median Household Income number tells us only that one very narrow bit of information — it does not tell us how American families are doing income-wise.  It does not inform us of the economic well-being of American families — rather it hides the true state of affairs.

Thus, I say that the publicly offered Average Household Income, rather than shedding light on the economic well-being of American families, effectively shines a Beam of Darkness that hides the really significant data about the income of America’s households.  If we allow ourselves to be blinded by the Beam of Darkness that this sort of truth-hiding average represents, then we are failing in our duty as critical thinkers.

Does this all mean that averages are bad?

No, of course not.  They are just one way of looking at a batch of numerical data.  They are not, however, always the best way.  In fact, unless the data one is considering is very nearly normally distributed and changes are caused by known and understood mechanisms, averages of all kinds more often lead us astray and obscure the data we should really be looking at.  Averages are a lazy man’s shortcut and seldom lead to a better understanding.

The major logical and cognitive fault is allowing one’s understanding to be swayed, one’s mind to be made up, by looking at just this one very narrow view of the data — one absolutely must recognize that the view offered by any type of average is hiding and obscuring all the other information available, and may not be truly representative of the overall, big picture.

Many better methods of data analysis exist, like the simple bar charts used in the school boys’ example above.  For simple numerical data sets, charts and graphs, if used to reveal (instead of hide) information, are often appropriate.

Like averages, visualizations of data sets can be used for good or ill  — the propaganda uses of data visualizations, which now include PowerPoints and videos, are legion.

Beware of those wielding averages like clubs or truncheons to form public opinion.

And climate?

The very definition of climate is that it is an average — “the weather conditions prevailing in an area in general or over a long period.”  There is no single “climate metric” — no single metric that tells us what “climate” is doing.

By this definition above, pulled at random from the internet via Google, there is no Earth Climate — climate is always “the weather conditions prevailing in an area in general or over a long period of time”.   The Earth is not a climatic area or climate region, the Earth has climate regions but is not one itself.

As discussed in Part 1 — the objects in sets to be averaged must be homogeneous and not so heterogeneous as to be incommensurable.  Thus, when discussing the climate of a four-season region, generalities are made about the seasons to represent the climatic conditions in that region during the summer, winter, spring and fall, separately.  A single average daytime temperature is not a useful piece of information to summertime tourists if the average is taken for the whole year including the winter days — such an average temperature is foolishness from a pragmatic point of view.
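That pragmatic foolishness is easy to put in numbers.  The monthly temperatures below are invented for a hypothetical four-season city, not taken from any real record:

```python
from statistics import mean

# Hypothetical monthly mean temperatures (deg C), Jan..Dec, for an
# imaginary four-season city -- invented for illustration
monthly = [-5, -3, 2, 9, 15, 20, 23, 22, 17, 10, 4, -2]

annual = mean(monthly)                       # whole-year average
summer = mean(monthly[5:8])                  # Jun-Aug
winter = mean(monthly[0:2] + monthly[11:])   # Dec-Feb

print(round(annual, 1))  # 9.3
print(round(summer, 1))  # 21.7
print(round(winter, 1))  # -3.3
```

The annual average of about 9 °C describes neither the 22 °C summers nor the below-freezing winters; it is an average of incommensurable seasons, useful to no tourist.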

Is it also foolishness from a Climate Science point of view?  This topic will be covered in Part 3 of this series.   I’ll read your comments below — let me know what you think.

Bottom Line:

It is not enough to correctly mathematically calculate the average of a data set.

It is not enough to be able to defend the methods your Team uses to calculate the [more-often-abused-than-not] Global Averages of data sets.

Even when averages are computed from homogeneous data and objects, and are physically and logically correct, they return a single number that can incorrectly be assumed to be a summary or fair representation of the whole set.

Averages, in any and all cases, by their very nature, give only a very narrow view of the information in a data set — and if accepted as representational of the whole, will act as a Beam of Darkness, hiding  and obscuring the bulk of the information;   thus,  instead of leading us to a better understanding,  they can act to reduce our understanding of the subject under study.

Averages are good tools but, like hammers or saws, must be used correctly to produce beneficial and useful results. The misuse of averages reduces rather than betters understanding.

# # # # #

Author’s Comment Policy:

I am always anxious to read your ideas and opinions, and to answer your questions about the subject of the essay, which in this case is Averages, their uses and misuses.

As regular visitors know, I do not respond to Climate Warrior comments from either side of the Great Climate Divide — feel free to leave your mandatory talking points but do not expect a response from me.

I am not an economist — nor a national policy buff – nor interested in US Two-Party-Politics squabbles.  Please keep your comments to me to the question of the uses of averages rather than the details of the topics used as examples.  I actually have had experience in building exercise equipment for a Youth Camp.

I am interested in examples of the misuse of averages, the proper use of averages, and I expect that many of you will have varying opinions regarding the use of averages in Climate Science.

# # # # #

BallBounces

Thank you for this above average effort!

Good one.

ReallySkeptical

“Thank you for this above average effort!”
Think it just average textbook dribble. You are causing a “Lake Wobegon effect”.

CarolinaCowboy

Pun intended, I assume.

Terry Williams

Having two legs I am slightly above the average number.  No one has three but many have one, some have none ??? Am I right?

Ben of Houston

Terry, I am obliged to feel offended on behalf of mutants and conjoined twins everywhere.

drednicolson

And at least one lady with four.

John Haddock

Should be a required annual refresher and test for every journalist.

Mick In The Hills

I’ll never regard average numbers in the same light after reading your expose Kip.
Thank you.

“The complex computer programs used to create metrics like Global Average Land and Sea Temperature or Global Average Sea Level are believed by their creators and promoters to actually produce a single-number answer, an average, accurate to hundredths or thousandths of a degree”
Wrong. The observational data is used to create a spatial prediction of the unmeasured locations. This prediction is referred to as an average. Really a misnomer. The precision of the prediction does not state our knowledge about the observations but rather minimizes the error of prediction.
Operationally it means this.
When we say the global average is 15.567 that mean…
If you select a random unmeasured location…and do this
Say 1000 times that the prediction 15.567 will give you a minimum error..It will produce less error than say 15
The observations are not averaged in an any simple sense of the word. I used to think that. Then I read some code.
Then I studied the theory of spatial prediction.
You might try that.
Finally I have explained this many times kip.
Ignoring those explanations means you cannot and do not engage in critical or skeptical thinking. You fool yourself.

John Bills

And off you go…………

Jose Melkander

Mosher [snip . . . pointless rants don’t add to what we know. Try harder to teach or learn . . . mod]

Duster

In shooting you take into account two factors: precision and accuracy. Precision indicates high consistency of aim. The shots cluster together on the target, even though they may fall well away from the intended point of aim. Precision reflects the shooter’s steadiness of grip and quality of eyesight.
Accuracy is a different thing. It is a measure of how close to the actual point of aim a round falls. In the real world, accuracy is far more important than precision. That is because an accurate rifleman has a higher likelihood of making a “useful” shot than a precise shooter whose weapon is not properly sighted in, or who cannot “accurately” judge the range and the effects of weather on shooting. Ten precise but inaccurately placed rounds are a waste.
So respecting climate models the real question is do models accurately bracket real conditions, or do they cluster precisely with a bias in the aim. Every plot of climate models I have seen shows that they are precise but biased rather than accurate, if somewhat scattered. It isn’t the precision that is the problem but rather the lack of accuracy. The bias indicates poor “sight adjustment.”

David S.

You’re confusing precision with repeatability. Repeatability is the indication of consistency (how close the shots cluster). Precision is how thin are the circles on the target you are using to measure your accuracy and repeatability (or what is the subdivision on the tape measure you use).
I can be very precisely very wrong: “I missed the target by 1.00003401 miles”
I can be very in-precise in my claim of accuracy: “I was within 0 light years of the target”
I can have a precise measurement with high inaccuracy: “3.312 +/- 2.500 mm”
and significant figures are not the same thing as precision (even if they are sometimes used to approximate it): the decimal number 30 has 1 significant digit. the binary number 11110 has 4 significant digits. They both indicate the exact same quantity but if I were to use the significant figures as an indication of precision I get 10 (decimal) in one case and 2 (decimal) in the other. I just don’t know what the precision is unless I have many numbers from the same instrument to compare (or the specs of the instrument). For example if I see a measurement of 3.125″, is my precision 1/1000th of an inch or 1/8th or maybe even 1/16th or 1/32nd? I don’t know unless I look at the measuring device (tape measure or calipers?) or look at enough other measurements to infer it.

Jim Gorman

David S,
You are obviously not familiar with the shooting sports. In addition, though, your post highlights the need for a precise definition of what is being measured and how the measurements are used.

benofhouston

David, significant figures aren’t universal precision. However, they are used to show precision if no error is given. The presumption is that 5.2 means it is between 5.15 and 5.25 if nothing else is given.
If you have a different estimate of error and do not give it, then you are being deceptive.
Finally, repeatability and precision are normally used as synonyms, even in the law (40 CFR 60 Appendix B uses it this way). You are mentioning only precision of speech, not precision of measurement.

George Tetley

I have bought a lottery ticket with the same numbers for 2 years! Why you may ask? Ah, the law of averages!

Tim Hammond

If that is true, then your methodology is faulty. Averaging for precision only works if you are measuring the same thing many times, not different things many times. Temperature can only be measured once, because it only occurs once. You cannot measure the temperature of a second again ever again. Of course you can measure it at that instant many times, but that’s not what you are claiming you do.
Temperature changes in the fourth dimension, unlike say a strip of metal.
And of course you are not measuring but extrapolating, because it is, as you say “unmeasured”, then you are not measuring at all anyway.
So how can you claim that something that is not measured is measured?

Jake

To Steven Mosher: In reference to “Operationally it means this.
When we say the global average is 15.567 that mean…
If you select a random unmeasured location…and do this
Say 1000 times that the prediction 15.567 will give you a minimum error..It will produce less error than say 15”
So help me understand the process here. The code is designed, through the theory of spatial prediction, to minimize if not entirely eliminate the error associated with the prediction. BUT the prediction, theoretically, could deviate substantially from reality, yes? Is there a link between the precision of the predicted value vs. the precision to predict the temperature in the unmeasured location? Have we taken the time (and, obviously, money) to test these predictions by measuring some of these locations?
In other words …..
15.567 +/- 0.002, (wow, really precise prediction, let’s go to Vegas)
15.1 +/- 0.1 (damn thermometers, if only we could make them better)
But the precision in the prediction is poor, yes? The Falcons still lost the Super Bowl …….
Am I reading this correctly? If not, please clarify.

Jake

Kip: Thank you, link followed and paper printed. I’m looking forward to figuring this out.

Paul Penrose

Kip,
The real question is: do we know what the real error bars are for their final product (global average temperature)? I don’t think with the information we have that it is truly calculable.

MarkW

“The observational data is used to create a spatial prediction of the unmeasured locations.”
Anyone who thinks that you can make 2 observations 1000 miles apart, and from them predict what is happening at the mid point is no scientist.

[name-calling snipped by author – violates WUWT policy] Mosher,
If you say that the average is 15.567, but the instruments are only accurate to + – 0.5, all you did was lie. The correct figure would be 15.5 + – 0.5 but somehow you never say that…

David S.

But 15.567 +/- 0.5 is more truthful. The temp could be between 15.067 and 16.067. If I say 15.5 +/- 0.5 then I am saying it is between 15 and 16 when it actually might be slightly over 16. It all boils down to whether anyone cares about the .067. It’s more illustrative for 15.75 +/- 0.5, though, because it is more likely someone might care about the 0.25 even with the 0.5 accuracy.

benofhouston

David, If he said 15.567+/- 0.5, then I would agree with you, but he didn’t. People have been giving these in simple numbers, “15.567”. As we tell every child in freshman chemistry that means “15.567 +/- 0.0005” unless stated otherwise.
In short. HE SHOULD HAVE STATED OTHERWISE.

David S., this is not correct. You just cannot add significant digits to a thermometer accurate only to +/- 0.5 degrees C. This is just inventing accuracy that the instrument does not possess, making up meaningless numbers. 15.5 +/- 0.5 C is the correct way to report this average. I am not sure that any “Climate Scientist” would ever admit this, because they love to see those headlines about “Hottest Year EVAH” even though this is scientifically meaningless.

Steve Case

The average of 49 and 51 is 50 and the average of 1 and 99 is also 50. I’ve posted that one a few times.
The average temperature does not consider the Maximum and Minimum temperatures and as a result, much of the important information is lost by doing so. Here’s a U.S. map of color coded Maximum, summer temperature trends from NOAA’s Climate at a Glance which DOES give you the Max and Mins.
http://oi68.tinypic.com/95vcec.jpg
You don’t get that result if you just look at averages.

DHR

But don’t the warmists claim that the effect of CO2 forcing is to increase nighttime minimum temperature, not daytime maximum temperature? If Steve Case’s info is correct, and the warmists are correct, should we not be seeing a gradual decrease in the range of daily min-max temperatures? Do we?

DHR – maybe; but many OTHER things cause the same effect. Long term min max’s show the same thing for ‘some’ areas, particularly cities (UHI?) but clouds and a number of other things can do the same thing. Correlation is one thing but how do you show Causation when there are so many variables. One of Bob Tisdale’s books covers this issue in some detail with some good graphics of convergence/divergence by year and by latitude.

Clyde Spencer

DHR,
The answer to your question is “Yes” and “No.’ Take a look at the graphs in my article at http://wattsupwiththat.com/2015/08/11/an-analysis-of-best-data-for-the-question-is-earth-warming-or-cooling/

Leonard Lane

120 F in Phoenix and 115 F in Tucson today. Not hotter than ever and not near the average.

Alan McIntire

I think you’re spot on. Global warming caused by more greenhouse gases should cause overall temperature increases with a decrease in day-night temperature differences.
Warming by the sun, or other outside sources, should cause an overall temperature increase with the day-night temperature difference ratio staying about the same.

RP

Alan McIntire June 20, 2017 at 5:28 am:

Global warming caused by more greenhouse gases should cause overall temperature increases with a decrease in day-night temperature differences.

Shouldn’t it be the other way around?
The greenhouse effect works by absorbing a more-or-less constant fraction of the outgoing surface radiance and recycling a more-or-less constant fraction of that back to the surface, thereby recycling a more-or-less constant fraction of the outgoing surface radiance back to the surface overall.
But there is greater outgoing surface radiance on the warmer day-side of the planet than on the cooler night-side, so the recycled fraction must also be greater on the day-side, thereby causing a greater surface warming on the day-side than on the night-side. If in fact the night-side is warming faster than the day-side, then I think that could not be caused by an increase in the strength of the greenhouse effect.
The same argument applies to all areas of the surface that are warmer or cooler than others, of course, and the greatest difference in warming-rates due to an increase in the greenhouse effect should arise between the tropics and the polar regions, with warming in polar regions being slightest and warming in the tropics being greatest.
I have never understood the alarmists’ idea that the greenhouse theory somehow implies that warming should occur faster on the night-side and in the polar regions of the planet. It seems the direct opposite of what the theory actually predicts to me.

Alan McIntire

In reply to RP. I originally started to investigate this issue when I read some posting purporting to prove the Stefan-Boltzmann law “wrong” based on lunar temperatures. You might find this link, regarding Newton’s law of cooling, of interest.
http://www.ugrad.math.ubc.ca/coursedoc/math100/notes/diffeqs/cool.html
The law gives this equation:
T(t) = Ta + (T0 - Ta)*e^(-kt)
Where T(t) gives Temperature, T, as a function of time, t,
Ta is ambient background temperature, and T0 is the starting temperature of the body warming up or cooling off.
mass of atmosphere = 5×10^18 kg = 5×10^21 gm
temperature of atmosphere = 255 K (the effective radiating temperature to space, which underestimates the heat content of the total atmosphere)
specific heat = 1.01 joules/gm·K
5×10^21 gm × 1.01 joules/gm·K × 255 K = 1.288×10^24 joules
radius of earth = 6400 km = 6.4×10^6 meters
area of earth = 4·pi·r^2 ≈ 5.147×10^14 square meters
240 watts/sq meter = 240 joules/sec per square meter
60 sec/min × 60 min/hr × 24 hr/day = 86,400 secs per day
5.147×10^14 sq meters × 240 joules/sec/sq meter × 8.64×10^4 secs/day = 1.067×10^22 joules per day radiated away
1.067×10^22 / 1.288×10^24 ≈ 0.83%
So the daily heat loss of the atmosphere is less than 1% per day. That makes sense when you realize that although surface temperatures may swing by 20 degrees K or more during the 24 hour day/night cycle, thanks to direct radiation to space from earth’s surface, meteorologists are still able to make fairly accurate estimates of daily highs and lows about a week out, because of that temperature stability of the atmosphere. Since the temperature for most of the atmosphere remains about the same throughout a 24 hour day, we continue to get the same daytime radiation from the sun, but the radiation from the atmosphere increases by the same amount both day and night. Since temperature is proportional to the fourth ROOT of radiation, that implies more warming at night than during the day from additional greenhouse warming.
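The arithmetic above is easy to check. Here is a short sketch (in Python, my choice rather than the commenter’s) that simply re-runs the figures as given: the mass and specific heat of the atmosphere, a 255 K effective radiating temperature, and 240 W/m² of outgoing flux:

```python
import math

M_ATM = 5e21      # mass of atmosphere, grams (5x10^18 kg)
C_P = 1.01        # specific heat of air, joules per gram-kelvin
T_EFF = 255.0     # effective radiating temperature, K
R_EARTH = 6.4e6   # radius of earth, meters
FLUX = 240.0      # outgoing radiation, watts per square meter

heat_content = M_ATM * C_P * T_EFF        # ~1.288e24 joules
area = 4.0 * math.pi * R_EARTH ** 2       # ~5.147e14 square meters
daily_loss = area * FLUX * 86_400         # joules radiated in one day

print(f"heat content : {heat_content:.3e} J")
print(f"daily loss   : {daily_loss:.3e} J")
print(f"fraction/day : {daily_loss / heat_content:.2%}")  # ~0.83%
```

The last line reproduces the comment’s figure of just under 1% of atmospheric heat content lost per day.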

RP

Alan McIntire June 20, 2017 at 4:06 pm.
Thanks for explaining your thinking behind your proposition that additional greenhouse warming will produce more warming at night than during the day.
I must confess that I found your argument hard to follow. I cannot see the relevance of Newton’s law of cooling, nor your subsequent calculation of the percentage of atmospheric heat content that is radiated to space daily, interesting as these items of intellectual stimulation may be. In fact, your complete argument seems to be contained in your last two sentences, where you say:

Since the temperature for most of the atmosphere remains about the same throughout a 24 hour day, we continue to get the same daytime radiation from the sun, but the radiation from the atmosphere increases by the same amount both day and night. Since temperature is proportional to the fourth ROOT of radiation, that implies more warming at night than during the day from additional greenhouse warming.

I have highlighted what seems to me to be your key-assumption in the quotation above. Where have you got that from? How is it implied in the theory of the greenhouse effect? I can see no rational justification for it.

Alan McIntire

Note this equation:
T(t) = Ta + (T0 -Ta)*1/(e^kt)
During a 24 hour day, the earth receives about the same amount of radiation from the atmosphere, call it
X, which will be some fraction of the average daily radiation we get from the sun.
The warming we get from the sun changes constantly, but take an average for the 12 hour daytime
period and call it 1.
In this case, the ambient background temperature would be proportional to
(1+X)^0.25, thanks to the Stefan-Boltzmann law.
Then during the daytime, the earth’s surface starts out with T0 less than Ta and starts
warming up, trying to reach Ta.
At night, the ambient background temperature, Ta, becomes directly proportional to
(X)^0.25, thanks to that same Stefan-Boltzmann law. At night, T0 at earth’s surface is WARMER
than the ambient background temperature, Ta, so the earth starts to cool off toward a temperature
proportional to the fourth root of X.
Now throw in additional greenhouse gases so the radiation from the atmosphere to earth’s surface
increases to X + Y, where X and Y are both positive numbers. If they’re not both positive, that means the greenhouse effect can cause COOLING. As an aside, it DOES cause cooling in some frequencies on Saturn’s moon Titan, at Titan’s low temperature.
Now compare the ratio (new daytime ambient temperature / old daytime ambient temperature) to the ratio
(new nighttime ambient temperature / old nighttime ambient temperature).
You’ll get
((1+X+Y)/(1+X))^0.25 for new daytime Ta / old daytime Ta, which will ALWAYS be smaller than
((X+Y)/X)^0.25.
Try it for any numbers.
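Taking up the invitation to “try it for any numbers”: the inequality also follows algebraically, since cross-multiplying X(1+X+Y) against (1+X)(X+Y) reduces the comparison to 0 < Y. A brute-force sweep in Python, offered here as a sketch, confirms it numerically:

```python
import itertools

# For positive X (night-side flux) and Y (added greenhouse flux), the
# day-side temperature ratio should always be below the night-side ratio.
for X, Y in itertools.product([0.01, 0.1, 0.5, 1.0, 3.0, 10.0, 100.0], repeat=2):
    day_ratio = ((1 + X + Y) / (1 + X)) ** 0.25
    night_ratio = ((X + Y) / X) ** 0.25
    assert day_ratio < night_ratio, (X, Y)

print("day-side ratio < night-side ratio for every sampled X, Y")
```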
The rest may be very boring and you can skip it.
Check the article on Newton cooling and you’ll also see how
to derive k experimentally from 1 hour of cooling, assuming you have a local climate and not local weather. Although the air gains or loses heat on the order of less than 1% per day, clouds make up about 1/6 of the total greenhouse effect, and cloud cover can fluctuate by plus or minus 50 watts or so over short intervals, giving us weather rather than climate.
I couldn’t solve T(t) = Ta + (T0 -Ta)*1/(e^kt) exactly, with Ta constantly changing, but I tried a numerical approximation.
With the sun constantly changing elevation and radiation, I picked an average latitude, 30 degrees, and
the earth at the equinox. I looked up a table of sun angles at various times in a 12 hour day,
used 12 hourly figures for the various Ta amounts, and tried to get a reasonable figure for
warming.
Before I could solve the daytime figure, I needed a number for e^kt. I got that by picking an
average sundown temperature, T0. For the ambient background temperature, I used Trenberth’s figure of
333 watts/sq meter.
http://theinconvenientskeptic.com/wp-content/uploads/2010/11/FT08-Raw.png
The Stefan-Boltzmann equation for a blackbody goes
T(kelvins) = S × (watts/square meter)^0.25, where S is a constant. Our first step is to find that constant.
Doing a google search, I find 1000 K implies a blackbody flux of 56,790 watts/square meter, so
T = 64.77867 × W^0.25. I plugged in 333 nighttime watts using Trenberth’s figure, and got a
nighttime ambient temperature Ta of 276.72 K.
I plugged in T0, average sundown temperature of about 293K
Giving
T(t) = Ta + (T0 -Ta)*1/(e^kt)= 276.72 +(293 -276.72)/(e^kt)
I needed one additional figure, T(t) for a period of hours; then I could solve for k.
I got something like k = 1.07053. I assumed that the same k works for daytime as for nighttime; otherwise the problem would be unsolvable.
I played around with daytime warming, nighttime cooling, and finally got a balance.
I plugged in additional greenhouse warming, and got a higher balance both daytime and nighttime, with about 1/3 of additional average warming happening during the day, and 2/3 of the additional warming happening at night. In doing so, I convinced myself that the “Stefan-Boltzmann” deniers didn’t know what they were talking about.
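The k-fitting step described above can be sketched in a few lines. Ta (276.72 K) and T0 (293 K) come from the comment; the “cooled to 282 K after 3 hours” observation is a made-up placeholder, since the comment does not say which measurement produced its k ≈ 1.07:

```python
import math

TA = 276.72   # nighttime ambient temperature from Trenberth's 333 W/m^2, K
T0 = 293.0    # average sundown surface temperature, K

def newton_temp(t, k, Ta=TA, T0=T0):
    """Newton's law of cooling: temperature after t hours."""
    return Ta + (T0 - Ta) * math.exp(-k * t)

def solve_k(t, T_obs, Ta=TA, T0=T0):
    """Invert Newton's law for the rate constant k, per hour."""
    return math.log((T0 - Ta) / (T_obs - Ta)) / t

# Hypothetical observation: surface has cooled to 282 K three hours
# after sundown (placeholder value, for illustration only).
k = solve_k(3.0, 282.0)
print(f"fitted k = {k:.4f} per hour")
print(f"T at 6 h = {newton_temp(6.0, k):.2f} K")
```

With a real hourly temperature observation in place of the placeholder, the same two functions reproduce the procedure the comment describes.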

RP

Alan McIntire June 22, 2017 at 5:44 am:
Thanks for this elaboration of your argument. However, I remain unconvinced by it.
My principal objection is that you are treating the atmospheric radiance from the greenhouse gases (represented by Y in your argument) as being isotropic over the globe, which is necessarily untrue by definition of what the greenhouse effect is.
The greenhouse effect is defined as the warming that results from the atmosphere’s relative transparency to incoming shortwave radiation and its relative opacity to outgoing longwave radiation. Since there is practically no incoming shortwave radiation on the night-side of the planet, there is no greenhouse effect on that side either and therefore the value of Y for the night-side is eternally 0.
Thus, your expression for new/old night-time Ta in your statement

You’ll get
((1+X+Y)/(1+X))^0.25 for new daytime Ta / old daytime Ta, which will ALWAYS be smaller than ((X+Y)/X)^0.25.

i.e. ((X+Y)/X)^0.25, should be amended to {(X+0)/X}^0.25, which will always be equal to 1 regardless of the value of X.
On the other hand, the value of your expression for new/old daytime Ta, i.e. {(1+X+Y)/(1+X)}^0.25, will always be greater than 1 and will always increase in accordance with increasing values of Y.
Hence, increases in the strength of the greenhouse effect can only ever cause the difference between average day-time and night-time temperatures to increase and never to decrease as your argument contends.

Let’s call this map a “Beam of Dusk.” The vast majority of States have so many climate regions that a “State average” is almost as meaningless as a “National average.”
I’d like to see one that breaks it down to “temperature trend by county,” like the median income one – that would be far more meaningful. (I would make a small side bet, thanks to UHIs, that it would be almost identical to the income map, too, if you used the same coloring scheme.)

@Kip, seen many just like that one. Nice for knowing the general climate of an area – but useless considering the grid size (any specific place within the region may be very little like the “regional” climate).
However, if overlaid on the (at least county scale) trend map that I really want – hmm. Some very interesting things might pop out.

Duster

One of the problems with gridded data is the loss of detailed geographical influence. If you look over at Climate Audit, Nic Lewis has an interesting discussion on the influence of grid scale on estimates.

@Duster – that’s one of the basic problems with “climate modeling.” The current smallest model grid sizes are (barely, and that is arguable) good enough for a “proof of concept” model. Nowhere near anything useful. At ten kilometer resolution, you might start to get some useful results. Somewhere between one and five kilometers, you’d have a pretty good model, assuming everything else was correct… (A slight addendum here – some regions require subgrids with much finer resolution than one kilometer. And some others could be as much as 100 kilometers without much loss of information. This ignores the vertical dimension, too, which I believe will need to be something like 100 meters resolution at the very least.)
We don’t have that kind of resolution in measurements for model validation, by any means; nor do we have the processing systems to handle such a model. Which is why current “models” are completely useless.

Kip Hansen. (Writing to Observer)

Observer ==> There is this, the Koppen Climate regions map of North America:

A good display of regional (multi-state area) “climates” across the United States and Canada – using NASA/Hansen’s worthless and misleading Mercator Projection rectangles, but I digress.
Now, show the “pre-CO2” current world Global Circulation Model RESULTS and the grid rectangles that define the results of the 24 models that duplicate that map – both land and oceans and mountains and plains – to 0.20 degree accuracy.

RACookPE1978 June 20, 2017 at 4:39 am
Kip Hansen. (Writing to Observer)
Observer ==> There is this, the Koppen Climate regions map of North America:
A good display of regional (multi-state area) “climates” across the United States and Canada – using NASA/Hansen’s worthless and misleading Mercator Projection rectangles, but I digress.

Did Hansen use Mercator projections? The copies of his papers that I’ve seen didn’t.

That is a really good FIND – and sooooo interesting.
Definitely a graph you will NEVER see published in the MSM.

“….and the average of 1 and 99 is also 50.”
I don’t think anyone at the Fed will ever admit this, but I believe one of the reasons that everybody and their brother missed the looming financial crisis was that economists and policy makers relied on National home price and foreclosure statistics rather than drilling down and looking at the discrete elements. We did not have a homogeneous foreclosure problem. There were housing markets that should have been telling those in Washington that a huge bubble was forming, while in other markets only above-average price increases were experienced. When the foreclosures started, they were not uniformly distributed across the country. California foreclosure rates increased by 2000% between 2005 and 2009. Florida and Arizona and a few other states had very large increases as well. At the same time other states had less than double the foreclosure rates. A few states had barely a blip of foreclosures and nothing out of the ordinary.
Looking at the National averages of home price increases and foreclosure rate increases did not disclose what was actually happening throughout the country. Sometimes averages don’t tell us a thing.

Thank you.

Frank

Steve: What is a “maximum summertime temperature”? The average high temperature EVERY DAY for ALL of stations in a state for JJA? The single highest recording on ANY summer DAY at ANY STATION in a state? What is and isn’t a “significant” decline over what period in this map?
I found my way to this website, but it still wasn’t clear how to reproduce your map
https://www.ncdc.noaa.gov/cag/

Frank

Kip: Thank you for pointing out the comment above, which I missed. Since the output is about 6000 lines, that would be one high record per state per season per each of 105 years. FWIW, I suspect that many record temperatures are due to artifacts. I also suspect that the number of artifacts is gradually decreasing with time. With a dozen or dozens of temperature readings being taken in the average state every day, I doubt there is anything useful to be learned from a relative handful of extremes. Perhaps a future article will prove my snap judgment incorrect.
Respectfully, Frank

Tom Halla

Good commentary. I took statistics some 40 years ago, and this has been a rather better exposition on the faults of statistical measures than the class was.
The great advantage of the internet is that it is cheap and easy to show the actual graphs of the data one is using the statistical measures to describe. A skewed or bimodal distribution can be shown directly, rather than just describing it.
What seems to matter with global temperatures is the geographic distribution, not a single average. If, as appears to be the case, the “warming” is almost entirely in high latitudes, one can show that directly, rather than rely on not-entirely-clear descriptions of a map.

The calculated global average temperature is useless because it is averaging source areas (apples) with sink areas (oranges). The mass balance used in the models is even worse because it assumes that natural source and sink rates balance out (a false assumption). The natural source areas for CO2 produce around 20 times anthropogenic emissions.

commieBob

I learned this by judging High School Science Fairs and later reading the claims made in many peer-reviewed journals.

One of my buddies had a thesis adviser who grumped that students, given a formula, would attempt to use it no matter how inappropriate it was for the situation.
Engineers usually get that beaten out of them before they get their licenses. I suspect that some scientists never do.
As you observe, just because someone can use tools to generate a number, it doesn’t mean that they actually understand what they’re doing. On the other hand, they are likely to think they know what they’re doing … and that’s a very bad thing.

“On the other hand, they are likely to think they know what they’re doing … and that’s a very bad thing.”
Amen brother, amen.

braynor2015

> Engineers usually get that beaten out of them before they get their licenses.
Ahh, no.

SocietalNorm

One of the things I used to do in a previous job or two was to make sure that the young engineers understood what they were doing. Just because they put correct inputs into a properly functioning computer simulation doesn’t mean that the answer the computer spits out is not stupidly wrong.
There is a lot of difference between being a good engineer and a data-feeder to a simulation.

Jake

My father was a career civil engineer of some note, and worked well past normal retirement age because he enjoyed the work. When he finally left the field at the age of 70, I asked him “why now”? Being a “real” engineer, he always spoke carefully …. he paused and said “I’ve grown tired of teaching new employees how to do the job correctly.” I’ve never forgotten that conversation …….

Ed

I managed several hundred scientists whose single biggest problem was the misapplication of statistical models. Often they would collect data, then apply some statistical model out of a textbook or paper to their data whether or not their data met any of the model’s assumptions. We brought in an expert statistician, first on a part-time basis and eventually as a full-time hire, to review all experiments and projects prior to approval. We still had scientists try to misuse statistics.

“Averages are good tools but, like hammers or saws, must be used correctly to produce beneficial and useful results. The misuse of averages reduces rather than betters understanding.”
This was a wonderful essay and I intend to steal great portions of it to use in my own classes to illustrate the issues. Thank you for doing these two essays and the one yet to come.
I think the part above that I quoted is very important for everyone to understand. I read “How to Lie with Statistics” as a student decades ago as well as having taught the subject. I think that people should understand how one can be mathematically correct while still pulling the wool over your eyes and misleading you. Your essay makes that abundantly clear. Bravo!

Clyde Spencer

I remember a friend making the point that after Three Mile Island, the claim was made that within a given radius of the reactor the average radiation exposure was a trivial amount above the background. What wasn’t acknowledged was that most of the radiation was in a downwind plume, and it was of consequence. Thus, the elevated exposure was hidden with an average.

Geoff Sherrington

Clyde s,
That sounds sinister.
There was no actual harm done from the plume & residue.
Geoff

Clyde Spencer

Geoff,
Whether harm was done or not, there was an attempt to use averages to downplay the seriousness of the release. That is the essence of Kip’s article.

Jamie

I had this average problem with my launch monitor. For instance, if I hit a six iron I might get a series like 210 213 215 208 143 200 197 204 214 135 yds. Its software calculates an average of 194 yds, but I would not play the club at that distance. The median, about 206 yds, is a better number to play the shot. Unless something is normally distributed, the average doesn’t mean a whole lot. I told the company about this, but they laughed at me.
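The point checks out with Python’s statistics module: for the ten carries listed, the two mishits drag the mean well below any playable distance, while the median shrugs them off.

```python
from statistics import mean, median

carries = [210, 213, 215, 208, 143, 200, 197, 204, 214, 135]  # yards

print(f"mean   : {mean(carries):.0f} yd")   # 194: pulled down by the mishits
print(f"median : {median(carries):.0f} yd") # 206: close to a typical shot
```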

Michael S. Kelly

When the global average temperature anomaly is an average of 8,000 estimated temperatures, and not a single measured temperature, the situation is even worse…

M eward

It seems to me that this ‘number crunching capacity’ without the traditionally expected understanding of ‘why and how’ is a manifestation of a common problem, perhaps best exemplified by the ‘uncertainty principle’ in quantum physics. That is, just as speed and position are two parameters inversely related in terms of accuracy in quantum physics, number-crunching capacity and deeper mechanistic understanding are at risk of a similar dichotomy as the former is made available to the less technically educated portion of the populace. Project that onto the mainstream media and it is clear what happens. This issue has effectively been with us forever. Traditionally the ‘shamans’/‘seers’/‘witchdoctors’ etc. saw patterns in the environment/stars/entrails and developed an authoritative modus operandi to articulate their case, and enhance their power. Sound familiar now?

Latitude

Kip, this was so good, I read it twice…..thank you!

John M. Ware

Very interesting article. My only cavil so far is in the coverage of average income. The average income within a state or county is meaningless to the individual; I am not comforted in my relatively low position on that scale by knowing that my city or my state is well above average and, thus, far ahead of me. My comfort as an individual is that, if I am now in the lowest quintile of average income, I am by no means required to stay in that low position. In fact, if I have read correctly, the vast majority of individuals who are within the lowest quintile in a given year will be in a higher position within a few years–often, a much higher position. Thus, the use of averages says nothing of economic mobility, which is an essential part of the American dream. Yes, I say to a struggling young couple, you are in strained circumstances now; but keep working, and in a few years your situation will be far different.

Poor getting richer
“Thankfully, the story of stagnating incomes in Canada is just that, a great fictional tale. The reality is that most Canadians, including those initially in the poorest group, have experienced marked increases in their income over the past two decades.”
http://business.financialpost.com/opinion/poor-getting-richer/wcm/cfa4d2a2-c21c-4afb-bb3e-cfe08800efeb

seaice1

Kip, I must confess to not yet having read the whole report, but the approach is interesting compared to the graphs you show for the USA. This is due to following individuals as they move through the quintiles, rather than focusing on the individuals who happen to be in the quintiles at any particular moment.
Thus the Canadian study revealed that absolute income for those they labelled “lower quintile” has risen by a staggering 781%. Your graphs show the bottom quintile stagnating, or actually showing a declining share of the income. That could be because the people in that bottom quintile are totally different individuals 10 years later.
The two stories are not necessarily conflicting, as it would depend on the level of mobility. I would be very interested to see a similar study from the USA to see if there was similar mobility. My suspicion is that mobility would be less, but I am prepared to say that is based on information about social mobility in the USA that is anecdotal rather than systematic.
I certainly did not spot a fatal flaw, nor see why it was misleading. Although I must confess that at first reading I thought it was telling a very different story to the one you tell through the graphs in the article.

seaice1

Kip. The studies are detecting different things. The Canada study describes the trajectory that people often have, that is, starting with a low-paid job, then moving up, then eventually moving down again. The important factor is how many people fail to move up: families or households that seem stuck in low-grade jobs. The problem of averages again, which can obscure important factors. It would be very interesting to see a similar study in the USA. As I said, I suspect there would be less mobility than in Canada. It would also be interesting to see the quintile income in Canada: has there been an increase in the lower quintile that has not occurred in the USA?

Slipstick

A well written essay with many good points, but, unless you are arguing semantics, the sentence “The Earth is not a climatic area or climate region, the Earth has climate regions but is not one itself.” is false. All of the “climate regions” of the Earth are connected and continuously exchanging energy and are, on average, one climate system.

As you describe the situation Slipstick, you are describing weather, not climate.
Supposedly, climate is weather in the long term. But then one is no longer describing weather patterns, and the generic “climate” descriptions are only applicable to a few locations. A nearby location 340 meters higher or lower will have different conditions.
Swaging data to represent up to 1,200 km swaths of land from a minimal number of sensors misrepresents vast swaths of the world, in both weather and climate.

Slipstick

ATheoK,
Actually, you are describing weather. What I am referring to is using all measurements from all instruments in a region, or over the entire Earth, over a significant period of time, to produce a single average.
To your point, since the weather in a region is moving, if you average thousands of measurements from that minimal number of instruments in the 1200 km swath, you will have an accurate representation of the average in that region.

Stevan M Reddish

Slipstick June 19, 2017 at 10:31 pm
You seem to be thinking of one factor of weather over a homogeneous region, such as precipitation from thunderstorms passing over the state of Oklahoma. Each storm would only drop rain along a narrow strip, but over time all parts of the state might get passed over by an approximately equal number of storms. In this situation, measurements of rainfall made at different stations might over time have similar totals. In this case, what would be gained by averaging? (5+5)/2 = 5. No information is gained.
ATheoK is thinking of a place such as the state of Oregon, which is divided by a large mountain range. The typical weather pattern is that moisture carried east from the Pacific by the prevailing winds precipitates primarily on the west slopes of the Cascades. Land to the east of the Cascades typically receives 1/6 the rainfall of land to the west. In this case, what would be gained by averaging? (60+10)/2 = 35. This is like averaging apples and oranges, or calculating the average sex of a local. Only misinformation results.
SR

Stevan M Reddish

Ha Ha Ha – I meant to say “locale”
SR

Specious claims, Slipstick!
You assume that a gross aggregation of numbers can be divided by the number of sources and the result called an “average”.
Meaning that you did not seriously read Kip’s article above and are inventing a strawman to distract from Kip’s points.
Nor are temperature averages true averages.
They are akin to saying something like:
• My child’s average height is.
• The average mountain is.
• The moon’s average distance is.
• The average cow is XX tall.
• The average alarmist’s education is.
• The average whale has xx calves…
What is being claimed as an individual temperature measurement is only valid for one very small location, whereas nearby conditions can be quite different:
• A temperature six miles away, in an urban area is higher.
• A temperature 330 meters higher or lower is a different temperature.
And completely ignores:
• Winter temperatures are much different than summer temperatures.
• Night temperatures, morning temperatures, evening temperatures.
• Temperatures when the wind changes direction.
Just what does “average” temperature convey?
• That someone can make mud out of dirt?
• That someone can add and divide? Anything?
This is before taking into account, ongoing absurd adjustments without full metadata!
• Well, before understanding that temperature sensors are not identical; in most cases they are not even similar!
• Well, before understanding that temperature sensors are not tested and certified in the field,
• temperature sensors are not run in parallel before replacing,
• sensor stations are not equivalent in any sense of the word,
• sensor stations are improperly positioned and local maintenance ignores station placement,
• sensor stations are frequently infested with wildlife,
etc etc etc.
The litany of abuses summed into temperatures makes accurate temperatures farcical. Summing and dividing bad numbers does not eliminate nor correct for egregious behavior, assumptions or false claims.
You may call such abuses “averages”, that does not make said numerical abuses valid nor representative of temperature.

Douglas

In 1995 we moved from Andover, MA to Fort Wayne, IN — working at virtually the same salary as before. It was amazing how much further the same salary went at the midwestern cost-of-living we now enjoyed. The rents were much lower for exactly the same type of neighborhood, the auto insurance was cut in half, and on and on. I really think your income comparison should be weighted or at least analyzed in terms of cost of living when talking about salaries in different parts of the country.

Also disposable income. Not going to repeat my own rant – see above…

Rick C PE

Kip: I think you are providing a valuable service with this series. I fully agree that the abundance of software available to run sophisticated data analysis such as ANalysis Of VAriance (ANOVA) has resulted in a focus on the numbers – p-values in particular – without bothering to look at the actual data or understanding what it tells us.
I was a chief engineer in an independent testing lab for many years and trained engineers and technicians in measurement methods and data analysis. I taught that the first thing to do with a data set is plot it in any way you can think of that is meaningful: bar chart, histogram, x-y scatter, line graph, etc. A good visualization of the data often tells much more than you can see just looking at numbers. I also taught that reporting an average without including information on the dispersion of the individual values (e.g. standard deviation, range, coefficient of variation, etc.) is statistical malpractice. So is not including a properly determined statement of uncertainty.
Averages are the ultimate data reduction tool. They often reduce an interesting set of data to a relatively meaningless single number.
Now, as to finding trends in long term time series data consisting of averages of data collected by varying methods and differing sampling schemes – that’s a much bigger can of worms. I suspect my personal hero, W. Edwards Deming, would look at what has been going on in climate science and weep.
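The warning about reporting an average without its dispersion can be illustrated with two made-up test series (illustrative numbers, not real lab data) that share a mean of 100 but would justify very different engineering conclusions:

```python
from statistics import mean, stdev

series_a = [99, 100, 101, 100, 99, 101]   # tight process, mean 100
series_b = [40, 160, 55, 145, 70, 130]    # wildly scattered, same mean

for name, data in (("A", series_a), ("B", series_b)):
    print(f"series {name}: mean = {mean(data):.1f}, stdev = {stdev(data):.1f}")
```

Reported as averages alone, the two series are indistinguishable; the standard deviation is what separates a controlled process from chaos.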

ChasMac

As noted several times here, averages often have little value in describing actual conditions. My favorite example: the average human being has one ovary and one testicle.
Always consider averages skeptically when they are claimed to describe the real world.

Brett Keane

@ ChasMac
June 19, 2017 at 7:51 pm: probably fewer….grin

Climate averages of temperature are too nonlinear to be meaningful, considering that it takes exponentially more accumulated ‘forcing’ to maintain ever-increasing temperatures owing to ever-increasing emissions, and considering the wide range of surface temperatures found across the planet. Surface emissions derived from temperature using the SB Law exhibit the property of superposition, allowing meaningful averages of emissions when calculated across a whole number of periods, which in a climatic sense must be a whole number of years. Converting the average surface emissions into an average surface temperature is a meaningful metric in the same way that the planet-average post-albedo incident solar power of 240 W/m^2 (Pi) and the corresponding average emissions of 240 W/m^2 (Po) have an EQUIVALENT average temperature of about 255K.
Superposition arises from the simple, linear, COE conforming differential equation that relates the instantaneous incident power, Pi(t) and the instantaneous outgoing power, Po(t) with the change in solar energy stored by the planet, dE(t)/dt, such that, Pi(t) = Po(t) + dE(t)/dt. E(t) is the solar energy stored by the planet and dE(t)/dt is its rate of change, which when positive, the planet warms and when negative, the planet cools. The surface temperature is linearly related to E (whose units are Joules), but not dE/dt (whose units are power), which mathematically is what the IPCC calls forcing and whose steady state value is defined to be zero. The property of superposition arises because COE tells us that any Joule is interchangeable with any other Joule and that if 1 Joule can do X amount of work, 2 Joules can do twice the work. This should be obvious since the units of work are Joules and it takes work to warm the surface.
This DE gets more interesting when we define an amount of time tau in which all of E would be emitted at the rate Po, so that Po = E/tau, and we can rewrite the equation as Pi = E/tau + dE/dt, which any EE student will instantly recognize as the form of the linear, time-invariant differential equation that describes the non-linear charging and discharging of an RC circuit, where tau is the time constant.
If Ps(t) is the SB emissions of a surface at a temperature of Ts(t) and we know that Ts(t) is linearly proportional to E(t) and we can easily quantify the average NET surface to space transmittance, which includes surface energy absorbed and re-emitted by the clouds and GHG’s, as an effective emissivity of an emitting surface at Ts(t) whose emissions are Po(t), we can mathematically ascertain the sensitivity. When you simulate the equations including the relative relationships between Po(t), Ps(t), Ts(t) and E(t), the average sensitivity, that is the average change in Ts(t) per change in dE(t)/dt turns out to be less than the lower limit claimed by the IPCC. The quantifiable effect doubling CO2 concentrations has is on the surface to space transmittance which decreases by about 1.5% when CO2 doubles and is what the IPCC claims is EQUIVALENT to 3.7 W/m^2 of dE/dt while keeping the NET surface to space transmittance constant. That is, instantaneously doubling CO2 instantaneously decreases Po by 3.7 W/m^2 which is indistinguishable from increasing Pi by the same amount.
The point of this is that for a climate system whose behavior can be expressed in terms of Joules and COE, averages and their relative changes are meaningful metrics and the fundamental reason is the linear behavior that arises when you require any one Joule to be interchangeable with any other one.

Paul Penrose

While I mostly agree with what you say, the problem is that the argument assumes all the energy absorbed at the surface is transported to the TOA by radiation. This is simply not true. In a gas, energy is also transported by conduction and convection. This is important because the CO2-energy-trapping mechanism is radiative in nature. So to the extent that some of the energy at the surface is transported via other mechanisms, there is less available for CO2 to “trap” in the lower atmosphere. Also, the results of the models are always reported in terms of temperature change, but converting from energy to temperature is not trivial, as it depends on a number of factors, probably dominated by water vapor content. How do they handle that? Seems like a good place to “tune” the output.
As far as averages go, they can be useful and meaningful to some degree, but keep in mind that we are talking about a data reduction method. You may gain insight into one particular metric, but at the expense of all the others. This is what is not generally understood, and it is the source of many misconceptions/deceptions.

Paul,
” …assumes all the energy absorbed at the surface is transported to the TOA by radiation”
This is not the case. It assumes that energy transported into the atmosphere by non-radiative means, plus its return to the surface (that is, energy transported by matter), has a zero-sum influence on the NET photons leaving the planet, and thus on the NET incident energy required to sustain the surface temperature. Whatever effect the non-radiative transport in and out of the atmosphere has, it's already accounted for by the resulting temperature of the surface and its consequential photon emissions of BB radiation. Trenberth muddies the waters by lumping in the return of non-radiant energy entering the atmosphere as ‘back radiation’, but it is clearer if you subtract out convection/thermals and latent heat from the back radiation term.
Converting energy to temperature (actually power) can be done trivially with the Stefan-Boltzmann Law. The complications arise from the atmosphere, which fundamentally turns a black body surface emitter into a gray body planet emitter. But the emitting surface itself, that is, the virtual surface in direct equilibrium with the Sun, can be considered an ideal BB radiator, and often is when processing satellite data, where the temperature of this radiator is an approximation for the actual temperature. Measurements show it to be within a few percent; moreover, it tracks deltas to an even higher accuracy.

Excellent article Kip.
And some well stated clarification specifics; e.g. commiebob and Steve Case.
Chippewa bloodline here.
Consider a normal person or any citizen.
As a person they are unique, identifiable and bring specific knowledge, talent and skills.
Averaged, as experts who study human faces have shown, the person is no longer identifiable. Nor are the concepts of knowledge, talent or skills viable when people are averaged.
There is a point, which social science people love, that when enough people are aggregated into a large population; that is when sociologists can “predict” behaviors, prejudices and biases.
Yet that is still not an average. Amassing sufficient persons into large populations sums up their individual knowledge, skillsets and general mental conditions, but that is not the same as an average.
Take the group, divide by population and achieve per capita average that very few individuals resemble, even faintly.
Averages are vague, fuzzy entities. Sports statistics utilize averages extensively; but fantasy sports players will be the first to recognize that statistics fail to reveal the current situation. Exact knowledge of how, when and why a person achieved their sports statistics is essential.
Alleged averages concocted from disparate unique and separate sensor sources are not averages!
Attempts to average unique separate sources requires;
• a) Identical sensors. Same batch, same certification, same verification procedures post-installation.
• b) Evenly spaced grid.
• c) All altitudes, latitudes and longitudes are equally represented
• d) Replacement sensors are run parallel sufficiently long for proper verification.
• e) Infestations or other contamination are reasons to reject affected data! Never adjust!

Here’s another thought for your household income maps and poverty. The U.S. Census Bureau calculates poverty based on a 50+ year-old study that takes the cost of a subsistence diet and multiplies it by three with no allowance for regional cost of living differences. Thus, California, with a cost of living about 50% more expensive than Texas, has the same poverty level as the Lone Star State. While Alaska and Hawaii are given more generous poverty level thresholds, the contiguous 48 states have the same poverty level.
Census addressed this issue in their groundbreaking Supplemental Poverty Measure research which also accounted for the value of non-cash government assistance in determining income (in the traditional measure, food stamps–now called Supplemental Nutrition Assistance Program–and Section 8 housing vouchers are not considered income). The Supplemental Poverty Measure also accounts for taxes, out of pocket medical expenses and work-related expenses. These factors cause the poverty rates to soar in the costly Pacific coastal region and in the Northeast and Florida while reducing poverty rates in the low cost Midwest and South.
When all is said and done, California has consistently seen the highest Supplemental Poverty rates in the nation for the past several years.

K.kilty

The distribution of wealth by county in Wyoming tells an interesting story if one knows quite a lot else about the state. The statement that wealthy persons have retired to Wyoming explains the high income in Teton county only. It is very, very wealthy and votes overwhelmingly Democratic. Most of western Wyoming has a large Mormon population and employment in the energy industry. Campbell county in the north central is energy employment, oil at one time but now mainly coal. With so much wealth and income connected to energy it is not difficult to understand why the state, as a whole, voted so heavily against Hillary.

Robber

Well presented. Now for the challenge of applying it to temperature data. I always find it amusing when I read that the average global temperature is 15 degrees C. Is there a spot on earth that is perfect because it is average?
February is the hottest month in Singapore with an average temperature of 27°C and the coldest is January at 26°C.
July is the hottest month in Los Angeles with an average temperature of 22°C and the coldest is January at 13°C, so a yearly average of about 17.5°C.
January is the hottest month in Melbourne (my home) with an average temperature of 21°C and the coldest is July at 10°C, so yearly average is about 15.5°C, almost matching the global annual average.
July is the hottest month in New York with an average temperature of 25°C and the coldest is January at 2°C, so a yearly average about 13.5°C.
The average temperature in London is 19°C in July and 5°C in January, so a yearly average of about 12°C.
July is the hottest month in Moscow with an average temperature of 19°C and the coldest is January at -8°C, so yearly average about 5.5°C. Why do people live there? No wonder the Russians might be in favor of some warming.
With all of these variations, we are then told that the average global temperature has risen by about 1°C in the last 100 years, and a further 1°C rise will be catastrophic. For whom? Where?

seaice1

“Is there a spot on earth that is perfect because it is average?” Why should average be perfect? In almost every case average is not perfect.
Your examination of populations is apt. People tend to be located in areas that are currently productive. If there were free migration of people we could probably adapt reasonably well to changes. Siberia gets more productive, so people move there from areas that have become less productive. However, given our current set-up of nation states with very little will to allow immigrants, this adaptation seems unlikely to happen.

I have a more fundamental objection to the averaging of temperature data. The SB Law is P = 5.67*10^-8*T^4, with P in watts per square meter and T in kelvins. Doing the math, it takes 3.3 W/m2 to raise the temperature from -30 to -29 deg C. But it takes 7.0 W/m2 to raise the temperature from +30 to +31 deg C.
Calculating a global temperature average to try and track changes in earth’s energy balance due to increasing CO2 is patently absurd. All the more absurd when one considers that the people who do these calcs have enormous computing power available to them. Taking all that temperature data, raising it to the fourth power, and THEN averaging might produce something useful, and would be trivial for them to do.
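The Stefan-Boltzmann arithmetic in the comment above is easy to check directly. A quick sketch (note that computed this way the warm-end figure comes out nearer 6.3 than 7.0 W/m^2, though the asymmetry that is the point of the comment stands):

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)

def flux(t_celsius):
    """Black-body emission (W/m^2) of a surface at the given temperature."""
    return SIGMA * (t_celsius + 273.15) ** 4

d_cold = flux(-29) - flux(-30)  # extra emission for a 1 C rise near -30 C
d_warm = flux(31) - flux(30)    # extra emission for a 1 C rise near +30 C
print(d_cold, d_warm)           # roughly 3.3 and 6.3 W/m^2
```

The nonlinearity is the whole point: the same one-degree anomaly corresponds to roughly twice the radiative change at +30 C as at -30 C.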

For the umpteenth time, they don’t average temperatures. They average anomalies. And they don’t use it to try to track changes in earth’s energy balance. For that you do have to track the temperature of the region radiating. Most of that is not at surface anyway.

TimTheToolMan

Nick writes

They average anomalies. And they don’t use it to try to track changes in earth’s energy balance.

Not true. They (try to) measure the temperatures of the ocean and from that derive the accumulated energy and hence energy balance.

pbweather

How do they get the normals to derive the anomalies that they then average? Seems a circular argument to me.

Nick Stokes June 20, 2017 at 12:38 am
For the umpteenth time, they don’t average temperatures. They average anomalies.

And for the umpteenth time, there is no difference. An anomaly of 1 deg from -30 is 3.3 W/m2 and an anomaly of 1 degree from +30 is 7.0 W/m2. Averaging anomalies is the exact same error as averaging temperatures, and for the EXACT same reason.
And they don’t use it to try to track changes in earth’s energy balance. For that you do have to track the temperature of the region radiating. Most of that is not at surface anyway.
That is EXACTLY what averaged temperatures and anomalies are used for. The press is full of quotes from various bodies screaming about the need to keep earth’s temperature rise below the “dangerous” threshold of 2 degrees over pre-industrial.

Kip Hansen

Nick ==> For heaven’s sake — Mosher says they don’t average any temperature but predict temperature that one would find at unmeasured locations and then just pretend that they have produced an average — you claim that “they” don’t average temperatures at all — they average temperature anomalies — if I asked you “anomalies of what?” you would have to truthfully answer “Anomalies of averaged temperatures.” There is no anomaly without an average to be an anomaly from!

Nick Stokes, “… they don’t average temperatures. They average anomalies. ”
1. Are you saying a particular station’s monthly average is the total of each day’s Tmax for the month plus the total of each day’s Tmin for the month, divided by two, with the base period calculated the same way, and the recorded anomaly is the current calculated monthly average minus the base period’s calculated monthly average?
2. Are the grids, whether they be 5 x 5, 3 x 3 or 5 x 3, weighted to correct for varying surface areas at differing latitudes when they are averaged?

Clyde Spencer

NS,
In order to compute an anomaly for a particular station, one has to compute an historical average first. Thus, temperatures ARE being averaged, contrary to your claim! If that historical average (baseline) has a standard deviation of tens of degrees, what is the precision of resulting anomalies?

Kip,
“you would have to truthfully answer “Anomalies of averaged temperatures.””
Absolutely not. That is the fundamental difference that almost everyone misses. You must form the anomalies first and then average. You get a quite different, and wrong, answer if you average temperatures in a region over a period and then take the anomaly. That is the homogeneity thing. You don’t have the same stations in each month.
Yes, people aren’t always careful about saying that it is an anomaly average. But well-informed people know what they mean.

Davidm,
“That is EXACTLY what averaged temperatures and anomalies are used for. The press is full of quotes from various bodies screaming about the need to keep earth’s temperature rise … “
They aren’t talking about radiative balance. They are just talking about the consequences of getting hot.
OLR isn’t simply determined by surface temperature. Here is a map. There is obviously correlation, but it isn’t tight. Australia is not really hotter than Brazil (year-round).
On anomalies, yes, you could weight them by T^3 to get the T^4 effect. It would just mean something different. Even locally – the annual average would be much more like average summer temperature.

Kip Hansen

Nick ==> Unless you have some way to reverse the arrow of cause and the arrow of time, I do not see how it is possible to… “You must form the anomalies first and then average.” Between you and Mosher, I am beginning to feel I’ve woken up on Bizarro World.
According to NOAA NCDC
“In climate change studies, temperature anomalies are more important than absolute temperature. A temperature anomaly is the difference from an average, or baseline, temperature. The baseline temperature is typically computed by averaging 30 or more years of temperature data.”
(link = https://www.ncdc.noaa.gov/monitoring-references/dyk/anomalies-vs-temperature )
This is as I thought: to find anomalies to average, one must first find each anomaly by taking the difference between a temperature and its 30-year average as a baseline.
Can you explain your statement “You must form the anomalies first and then average” in light of this very clear definition?

seaice1

Just a thought with the baseball analogy that came up earlier. However I know nothing about baseball, so please work with me.
You have 100 bowlers, you want to see if something is affecting their performance.
You take their average so far, then calculate an anomaly from their last game – that is the difference between their score in the last game and their average in the season up to then.
If we just averaged the scores before and after the result would be dominated by the high scorers. But if we take the anomalies then average we get a more meaningful result. We are then looking at the change rather than the absolute amount.
I am not sure about this but would welcome a comment from Nick to explain if these situations are remotely analogous; it might help me and others to understand.

Nick Stokes
They aren’t talking about radiative balance.
First of all Nick, you haven’t responded to my point about averaging anomalies being just as insane as averaging temperatures. It is the exact same problem. You cannot toss it off with a few words about getting “something else”. It doesn’t matter if you get something else or not, what matters is that averaging temperatures or anomalies is ridiculous.
As for not representing earth’s energy balance, sorry, but the profile of emission and absorption from TOA to surface IS part of the energy balance. How it changes affects temperatures where we live. Complaining that energy balance means exclusively energy exchange between earth system as a whole and outer space is a ridiculous interpretation of what I said. Substitute whatever terminology you want, at day’s end the mantra is look, the temps/anomalies are rising, it is because of CO2. The metric is meaningless.

Paul Jackson,
“recorded anomaly is the current calculated monthly average minus the base period’s calculated monthly average”
Yes. That is how the local anomaly is calculated, prior to spatial averaging.
And yes, in grid averaging, the cells are area-weighted (basically, cos latitude).
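The area weighting described here can be sketched in a few lines; the grid cells and anomaly values below are invented purely for illustration:

```python
import math

# Cos-latitude weighted average of gridded anomalies.
# (lat_center_deg, anomaly_C) pairs; values invented for illustration.
cells = [(75, 2.0), (45, 1.0), (15, 0.5), (-15, 0.4), (-45, 0.8), (-75, 1.5)]

weights = [math.cos(math.radians(lat)) for lat, _ in cells]
weighted = sum(w * a for w, (_, a) in zip(weights, cells)) / sum(weights)
unweighted = sum(a for _, a in cells) / len(cells)

print(weighted, unweighted)  # high-latitude cells count for less after weighting
```

Because a grid cell spanning fixed degrees of latitude and longitude shrinks in area toward the poles (roughly as the cosine of latitude), an unweighted cell average would overstate the contribution of polar cells.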

seaice
“would welcome a comment from Nick to explain if these situations are remotely analagous”
I know even less about baseball, but yes, I think so. Anomalies are useful for focussing on change. But the basic reason for using them is that when you calculate an average over time, the population of sites changes. Comparing the averages of different sets is OK providing they are homogeneous, i.e. equally (approximately) representative of sites on earth. For absolute temperatures, that is not true. So you’d have to worry every month whether you had the right balance of hot and cold places (and you don’t have much control). But anomalies are calculated to be homogeneous. That is, you subtract out the expected value. It’s still possible that
1. you didn’t get that quite right, or
2. There are other differences, eg variance (heteroskedastic)
But you have made a huge improvement.
An example of the marginal issue is the Arctic. Subtracting the base period means takes out the issue that it is habitually cold. But it doesn’t take out the fact that it is warming faster than elsewhere. That is why Hadcrut, which has had weak Arctic data (improving), showed lagging recent trends, which increased with proper accounting, as Cowtan and Way showed.

Kip,
“Can you explain your statement “”You must form the anomalies first and then average.”” in light of this verity clear definition?”
Yes. The NOAA definition would have been clearer if they had specified site averages, but the following text makes it clear that that is what they mean. Calculate the site anomaly, and then aggregate. Sometimes it is slightly fuzzier, as when they calculate a cell anomaly rather than site, arguing that temperatures within a cell are homogeneous enough to average. But the text says they don’t even do that. Here is the diagram they use to illustrate why anomalies are used [image].
Here is how they explain it:

Using anomalies also helps minimize problems when stations are added, removed, or missing from the monitoring network. The above diagram shows absolute temperatures (lines) for five neighboring stations, with the 2008 anomalies as symbols. Notice how all of the anomalies fit into a tiny range when compared to the absolute temperatures. Even if one station were removed from the record, the average anomaly would not change significantly, but the overall average temperature could change significantly depending on which station dropped out of the record. For example, if the coolest station (Mt. Mitchell) were removed from the record, the average absolute temperature would become significantly warmer. However, because its anomaly is similar to the neighboring stations, the average anomaly would change much less.

Of course, there is averaging within a site. Sub-day readings are aggregated to daily average. Those are averaged to a month (again, a bit of an assumption that within the month they are homogeneous, but accurate work will correct for this if there are missing values). That is part of the process of forming anomaly. But that has to be done before you start aggregating stations.
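The Mt. Mitchell point in the quoted NOAA text can be demonstrated with a toy calculation. The station baselines and this-year values below are invented for illustration:

```python
import statistics

# Five hypothetical stations: different baseline climates, similar
# year-to-year departures. All numbers are invented for illustration.
baseline = {"A": 15.0, "B": 14.0, "C": 12.5, "D": 11.0, "E": 5.0}  # E: cold mountain site
this_year = {"A": 15.7, "B": 14.9, "C": 13.3, "D": 11.6, "E": 5.9}

def averages(stations):
    abs_avg = statistics.mean(this_year[s] for s in stations)
    anom_avg = statistics.mean(this_year[s] - baseline[s] for s in stations)
    return abs_avg, anom_avg

all_abs, all_anom = averages(baseline)
drop_abs, drop_anom = averages([s for s in baseline if s != "E"])

print(all_abs, drop_abs)    # absolute average jumps when the cold site drops out
print(all_anom, drop_anom)  # anomaly average barely moves
```

Dropping the coldest station shifts the absolute average by well over a degree, while the anomaly average changes only in the second decimal place, which is exactly the homogeneity argument for forming anomalies before aggregating.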

seaice1

Nick, I am not sure it is possible to know less about baseball than I do, but I basically understand cricket, so I will accept the possibility. I fairly recently learnt the word “heteroskedastic” so I am always pleased when it comes up.
It seems to me that we would be dealing not just with our 100 baseball players, but a changing population of players. Some drop out and new ones come in. Using anomalies we can deal with this in a sensible way. Simple averages would make no sense. We would not be certain that we were not at any particular point over-represented by good bowlers or bad bowlers. Does this make sense, or is this a misrepresentation of the issue?


“For the umpteenth time, they don’t average temperatures. They average anomalies.”
Using the school class example, for the whole school, you would use anomalies from the average for the year level in a base period to avoid the variation in number of students in the different year levels. This still does nothing about a change in average due to changes in demographics and doesn’t make the changing average useful in estimating class height of the next generation.

I used an example a while back of a newspaper report on research that showed Australian women were getting fatter on average, by about 2 kg. A breakdown of the change in population of groups based on weight and age showed that only the over-110 kg women in the 45-60 age group had a significant increase in number, doubling from the previous survey 12 years before.
The interpretation using the average was that women in general were piling on the pounds. The more detail version showed huge improvements in saving morbidly obese women from death.

Frank

davidmhoffer wrote: “And for the umpteenth time, there is no difference. An anomaly of 1 deg from -30 is 3.3 w/m2 and an anomaly of 1 degree from +30 is 7.0 w/m2. Averaging anomalies is the exact same error as averaging temperatures and for the EXACT same reason”.
This isn’t much of a problem, David. Call T the mean global temperature and t the difference between a local temperature and the mean. In your extreme example, T = 273 K and t is +/- 30 K. If we use T to calculate outgoing radiation W:
W = σT^4
If 50% of the Earth were at T+t and 50% at T-t:
W’ = 0.5σ(T+t)^4 + 0.5σ(T-t)^4 = σT^4 + 6σT^2·t^2 + σt^4
The proper average W’ is slightly bigger. Calculate the ratio of these numbers:
W’/W = 1 + 6(t/T)^2 + (t/T)^4
T is obviously much bigger than t, so we can ignore the last term. In your example t/T is about 1/9 and 6(t/T)^2 is about 2/27, a 7% error.
In the real world, T is 288 K and the maximum t is about 13 K over oceans (70% of the surface). Except for Antarctica and some extreme locations during winter, the maximum t over land is usually 20 K. The average t is less than these extremes. Think about averaging the error term 6(t/T)^2 over the whole planet over a whole year. Maybe the average difference from 288 K is 10 K. I get an error term of about 0.7%.
So the problem of using (average T)^4 in place of the correct average(T^4) is fairly small.
In reality, most photons reaching space are emitted from the atmosphere high above the surface. Temperature variations with latitude are smaller there, but variation with altitude is bigger. You need the radiation module from a GCM to do these calculations properly. And we have satellite measurements to check these calculations. For the purposes of WUWT, the error in using mean global temperature to discuss mean global emission of LWR is almost always unimportant.
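The two-box arithmetic above checks out numerically; a direct sketch of the expansion:

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)

T, t = 273.0, 30.0  # mean temperature and the +/- departure in the example

w_of_mean = SIGMA * T**4                                         # sigma * (average T)^4
mean_of_w = 0.5 * SIGMA * (T + t)**4 + 0.5 * SIGMA * (T - t)**4  # average of sigma * T^4
ratio = mean_of_w / w_of_mean

# The binomial expansion predicts ratio = 1 + 6(t/T)^2 + (t/T)^4
print(ratio)  # about 1.073, i.e. a ~7% error even in this extreme case
```

With the more realistic t of about 10 K around T = 288 K, the same ratio drops to roughly 1.007, which is the 0.7% figure in the comment.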

Geoff Sherrington

Dmh,
That is very good point, one I have also used.
When you think about it, you wonder if such slack can be, and is, accounted for adequately in GCMs.
Geoff

ossqss

You are so right, Kip, on statistical averages. One can look at employment, durable goods, GDP and on and on. The tales can be told from many perspectives dependent upon positions taken on such. Ultimately, the real numbers will rule.
How would sea ice averages be viewed if we took a starting point of 1974 as an example?
Just sayin….

Clyde Spencer

Kip,
I have one small quibble with an otherwise excellent article. You said, “…believed by their creators and promoters to actually produce a single-number answer, an average, ACCURATE to hundredths or thousandths of a degree or fractional millimeters.” I believe that “precise” would be a better choice of words than “accurate.” The issue of the accuracy of the average is another question entirely, more related to the sampling and interpolation protocols.

“The high temperature today was 104. That is 23 degrees warmer than the normal of 81.” You can hear that on TV any day. First of all, there is no such thing as a Normal temperature. Who can say what is normal? Nobody. The number they are calling normal is in fact an average of the high temperature on that date for the 30-year period ending at the beginning of the current decade, as calculated and published by the National Weather Service. It is a short-term average which is somewhat meaningless from a climatic point of view. Such is life.

Kurt

While we’re on this subject, can anyone identify what useful information is conveyed by averaging the results of different computer models? Let’s say I know almost nothing about hockey (a completely true assumption) but that I’m given $100 to bet on an upcoming hockey game. Knowing nothing about hockey, I look online and see that “X” sports analyst is predicting Team 1 to win by 2 goals, “Y” sports analyst is predicting Team 1 to win by 1 goal, and “Z” sports analyst is predicting Team 2 to win by 1 goal.
The average of the three predictions is Team 1 by 2/3 of a goal, and I see that the spread is 1 goal, so I bet on Team 2 to beat the spread. But, as far as I can deduce, the information conveyed by the average relates to the predictions of the game, not the actual results of the game, which is a unique event not amenable to a mathematically probabilistic analysis. What I’m really measuring is the expected value of a next sequential prediction by another analyst. I’m not measuring anything related to the actual event, i.e. the singular game to be played in the future.
That I resort to taking such an average highlights, not my expertise in hockey (never that) or even my mathematical aptitude since I’ve come up with completely useless information. What it illustrates is my ignorance; because I lack the ability to look at the teams and their past performance and make my own judgment, I have to resort to a simple average of predictions of people I presume to be experts.
So given all that, what am I to conclude about the “expertise” of the IPCC when, faced with widely differing scenarios presented by different models developed by different groups of people with different assumptions about the inner workings of the climate, the IPCC can’t just pick the best and instead averages the results?

Don K

Kurt
I’m smart enough to think I understand your analysis, but not smart enough to know if it is correct. However, it does seem to me that it would apply equally well to the stock market, to most economic analyses, and to a wide variety of other things. Are you trying to destroy civilization as we know it?

Kurt

I’m not that grandiose. I have a hard enough time trying to destroy the weeds that keep infesting my lawn, so I just try to smite the little stuff.
Not sure what you mean by the “stock market” or “economic analyses.” The DJIA is an average share price of a specified basket of stocks, presumably indicative of the market as a whole, so the question is whether the selection of what’s in the basket is a sufficiently representative sample of everything, like polling people and trying to get a demographically representative sample.
My issues relate to the narrow question of what you are sampling. If you assume that the DJIA is indeed representative of the entire universe of stocks in the NYSE, then you should be able to take another representative sample of different stocks and come up with the same average number. If you assume that your polling methodology is accurate, a new poll should produce similar results even if the people change. The average relates to the common feature of your samples.
If I sample results from a single model, I can at least get my mind around the idea that the results show the expected behavior of the theoretical climate system that the model represents, which can be compared to reality as a kind of benchmark for whether my model accurately simulates the climate’s behavior. But if I start averaging together the output of different models, each model representing different theories of how the climate system works, that average doesn’t represent any useful metric. I’m sampling from among different, mutually exclusive theories about how something might work.
Let’s assume for example that the “average” behavior of the models happens to line up very well with the way the climate actually behaves in the years subsequent to the model runs, but none of the individual model averages do. If individually, none of the models got it right, then I can’t have confidence in the set of assumptions made by any individual one of the models. If I can’t have confidence in the ability of any individual model to accurately simulate the climate, what is the point of sampling them to begin with?
The polling example may provide the best analogy. Real Clear Politics averages multiple polls, each with different sampling and weighting methodologies, but the polls often have widely disparate results that can’t all be true. Averaging them together shouldn’t give you any better information about the opinions of the electorate at any given point in time. Possibly, if all of them are moving in the same direction over time, you can infer that, regardless of which poll most accurately samples the electorate as a whole, one candidate is gaining steam and the other is not. But the average tells you nothing, and if one poll contradicts the trend of all the others, how do you know that that one poll isn’t the one that is doing the correct sampling, without exercising independent expertise as to why it should be treated as an outlier?

Don K

Kurt — I was thinking of stock market analyses, not the DJIA/S&P/Nikkei per se.
But you are aware that the DJIA is a moving target? Companies are added and subtracted. I think that the only remaining member of the original 1896 DJIA is GE. Most (all?) of the rest are no longer in the index; they are mostly defunct, although some still exist after a fashion, e.g. US Rubber ended up as a small part of Michelin.

Kurt

“But you are aware that the DJIA is a moving target? Companies are added and subtracted.”
I’m not sure that’s necessarily a problem. Two different opinion polls taken in consecutive weeks will sample different people, but that doesn’t mean that each poll is not representative of the entire population at the time it was taken.

Clyde Spencer

Kurt,
It seems to me that with an ensemble, one can logically only have one ‘best’ result (barring duplicates). Thus, averaging the best result with the inferior results will give one something in between the best and worst. Is that useful? Probably not as useful as the best result. By not determining which are the good and poor results, no insight is gained on what contributes to the quality or utility of the different models.

Kurt

But that’s true with calling a coin toss heads or tails, as well. Only one result is actually going to happen, but that doesn’t mean that the “average” of the group of all possible outcomes isn’t useful.
For simplicity, assume that health insurance only insures against lung cancer. Insurers know that a smoker will either get lung cancer, or not. They assess the probability of a particular insured getting cancer in the policy term (usually pretty low, even if you are a smoker) and multiply it by the expected cost if cancer is contracted, to arrive at their “expected” cost. They add a “premium” on top of that for their profit, and bill the insured. The “expected cost” in this example will never happen. If a person gets cancer, the costs to the insurer will vastly exceed the “expected cost.” If the person does not get cancer, the costs are zero.
If the insurer makes this calculation for a very large number of people, accurately assessing the risks of each person and charging them the appropriate amount for their policy, then the insurer can probably pre-calculate how much profit it will make for the whole group, but for each insured, the proper, “expected” result does not correspond with the actual outcome.
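The expected-cost point can be made concrete with a tiny calculation. Every number below (probability, payout, margin) is invented for illustration:

```python
# Expected-cost sketch for the simplified lung-cancer-only insurer.
# All numbers are hypothetical.
p_cancer = 0.02            # chance an insured gets cancer during the term
cost_if_cancer = 250_000.0 # insurer's payout if cancer occurs

expected_cost = p_cancer * cost_if_cancer  # the "expected" cost per insured
premium = expected_cost * 1.15             # plus a 15% margin

# No individual ever incurs exactly the expected cost: outcomes are
# 250,000 or 0. But over many insureds the average payout approaches it.
print(expected_cost, premium)
```

This is the sense in which an average can be useful even though it corresponds to no actual outcome: it is a property of the ensemble, not of any member.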
Now Kip, above, says that averaging chaotic results retains no useful information, and my gut instinct agrees with that, though I think it’s more a question of how you assign a respective probability to each of the sets of initial conditions. I think the modelers treat them all as equally probable, but that has to be an assumption made of convenience (i.e., it’s all they can do) rather than being a reasoned assumption.

Don K

Not that it matters, but “An interest of mine, as my father and his 10 brothers and sisters where born on the Pine Ridge in southwestern South Dakota, the red oval.” “where” surely should be “were”.

Don K

Note on the Normal Curve/Gaussian distribution. Statisticians love the Normal Distribution because of its nifty mathematical properties. But AFAICS hardly anything in the real world other than (usually) the Standard Error of the Mean actually distributes normally. If I recall correctly, the poster child for the normal distribution, the Intelligence Quotient curve, actually required a substantial adjustment at one point to make the numbers coming out of testing more closely match the theoretical distribution.

Here is a real world Gaussian example:
https://m.youtube.com/watch?v=1WAhTdWErrU

Don K

Hmmmm. You ever actually measured a batch of resistors? Me neither. But I wouldn’t assume that the results will actually be Gaussian. It may well depend on the manufacturing process — e.g. tested and trimmed before encapsulation vs. built, then tested and sorted. See discussion at https://electronics.stackexchange.com/questions/157788/resistors-binning-and-weird-distributions.

Mark.r

Here, so far this June, we have had 12 days colder than normal and 7 days warmer, with the month’s average temp so far 0.2 °C warmer than normal.

Don K

Two general comments:
1. I think it’s OK and often useful to add apples to oranges. You just need to remember that the units of the result will be “fruit”. Not hard in that case. But sometimes the distinctions are a lot more subtle.
2. Not all useful numbers have a sound physical basis. Example — Sunspot numbers. If I understand them correctly, they are a rather complicated index, not a count. That’s why we never see sunspot numbers less than 10. Nonetheless, they do seem to correlate with solar activity and are said to be useful. What about statistical operations on sunspot numbers? Are those operations well behaved? Meaningful?

jonesingforozone

Sunspot numbers are an asymptotic proxy for solar activity, i.e., the sunspot number has a lower limit of zero while other measures of solar activity are decidedly not zero. Using stars other than our own, Astrostatistical Analysis in Solar and Stellar Physics concludes, in part, “We find that incorporating multiple proxies reveals important features of the solar cycle that are missed when the model is fit using only the sunspot numbers.”

Peta from Cumbria, now Newark

It all gives me brain-ache, yet still it’s a ‘brain-worm’ I cannot shake off.
There’s something pretty epic in it though; an ‘instinct’ or ‘intuition’ tells me so, and it’s how people interact with these ‘averages’.
What’s got me going is the flood discussion.
Let’s say a river, somewhere, anywhere, has an average flow rate or (more easily measured by people) an average depth.
That average could be decades or centuries long.
Let’s look at the River Mild: it’s been 5 feet deep under the bridge at Boringville for the last 250 years. Fine.
But because we’re doing averages, that means and by definition, unusual things happen.
So, after some cold wet weather on River Mild’s catchment, the sun comes out, temperatures soar and thunderstorms break out. Perfectly feasible as the T storms are fed by all the recent wet weather.
And suddenly, on a Tuesday, 3 hours after midnight, the River Mild rises by 20 feet and its flow rate goes up 30-fold. Because of the T storms.
Because lots of folks thought it was really a River Mild, they built houses next to it. They used ‘The Average’.
But for 6 or 7 hours it turned into River Wild.
It cut a vast swathe of muddy devastation through Boringville, destroyed homes, gardens, fields and a great deal of the people themselves.
Then for the next 250+ years it returned to being River Mild.
That single one-off event damaged a great number of people, but when the Ivory Tower Dwellers, at public expense, come to work out The Average, that (flash) flood completely vanishes.
So the people rebuild their houses and the High Street at Boringville and then what happens?
10 years later another freak flood arrives. Similar thing to playing The National Lottery.
(Not unlike Carlisle, on the River Eden. In Cumbria.) (The flood, not the lottery)
So where are your averages then? What good did they do you in either Carlisle or Boringville? Was there *really* any point to working them out? Are they not just even more faerie counting?
Did they lull people into a false sense of security or, as is equally valid for averages, a false sense of doom and disaster? As per climate science right now.
How do you/me/anyone connect those numbers with The Real World and Real People?
Maybe we could start by opening the doors and windows of a few Ivory Towers and kick the residents out.
I think I’ve worked it out here and now. There are too many Moshers and the like.
Over to you Donald……..
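The vanishing-flood point above is easy to check with made-up numbers: 250 years of 5-foot daily readings plus a single 25-foot flash-flood day barely moves the long-run average at all.

```python
# 250 years of daily depth readings at 5 ft, plus one flash-flood
# day at 25 ft (all numbers invented for illustration).
ordinary_days = 250 * 365
depths = [5.0] * ordinary_days + [25.0]

mean_depth = sum(depths) / len(depths)
print(round(mean_depth, 4))  # ~5.0002 ft: the flood is invisible in the mean
```

The 20-foot catastrophe shifts the 250-year average by about two ten-thousandths of a foot.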

Don K

Nassim Nicholas Taleb writes entertaining and possibly insightful books about extreme events and our inability to think clearly about them — The Black Swan, Fooled by Randomness. Possibly worth reading if you haven’t encountered them.

Clyde Spencer

Peta,
Your story illustrates why skewness and variance need to be provided along with a mean. Also, providing mode and median is informative. Focusing on mean alone is either being deceptive purposely, or illustrating statistical ignorance.
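A small skewed sample (invented figures) shows how the mean, median, and mode can disagree, which is why reporting the mean alone can mislead:

```python
from statistics import mean, median, mode

# A right-skewed sample: most values cluster low, one large outlier.
# Figures invented for illustration, e.g. incomes in $thousands.
incomes = [30, 32, 32, 35, 38, 40, 300]

print(mean(incomes))    # ~72.4: pulled far up by the single outlier
print(median(incomes))  # 35: the middle value
print(mode(incomes))    # 32: the most common value
```

Here the mean is roughly double the median, so quoting the mean by itself would paint a very different picture of the "typical" value than the data warrant.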

Kip,
I really don’t see the point of this loquacious dumbing down of already elementary statistics. Especially if inaccurate – the mean is an average, but the mode and median are not. They are all measures of central tendency.
I really can’t see the point of your second example. Of course the average, or any other summary statistic, like median, can’t tell you all about the dataset. That is why the various authorities present all that other information that you quote – all subset averages. I think the Dow is a useful average to think about. It is very widely quoted and used. It won’t tell you whether copper stocks are booming, or taxis have collapsed. Everyone knows that it won’t, but they still find it useful.
Your first example does again illustrate the use of averaging as an estimate of population mean, though you don’t seem to see it that way. It probably isn’t an average being sought, but a bottom quintile, or some other summary statistic, but the principle is the same. What you want is an estimate of the average of the population of boys for whom the bar is intended. The data for Mrs Larsen’s class is the sample you have available, at least initially. And you do need that summary statistic. The boys may vary, but the bar can only have one height.
Sampling is a science, and you need to get it right. As you say, it might happen that her class is not representative. That is not a problem with average calculation; it is a problem of sampling. And there are remedies. Maybe Mrs Larsen’s class had an unusually large number of Hispanics, and maybe they tend to be shorter (I don’t know that that is true). So you re-weight according to what you know about the population. It is that knowledge that you need to design the bar, plus someone who actually knows about statistics.

TimTheToolMan

Nick writes: “Of course the average, or any other summary statistic, like median, can’t tell you all about the dataset.”
Or indeed what a changing average actually means.

“Or indeed what a changing average actually means.”
Yes, you have to figure it out. Like when the Dow drops 2000 pts. It’s not obvious why, but plenty of people are going to be curious. It matters to them.

TimTheToolMan

Sorry, we only model to the market level. We don’t go down to the stock level, and when we try, it’s not a great result. But I’m sure knowing what the market in general is doing will give you enough information to help you choose your portfolio.

Mariano Marini

A well written essay with many good points, but, unless you are arguing semantics, the sentence “The Earth is not a climatic area or climate region, the Earth has climate regions but is not one itself.” is false. All of the “climate regions” of the Earth are connected and continuously exchanging energy and are, on average, one climate system.

I’m not trying to answer this question, just trying to clarify. My English is not so good, so I beg your pardon in advance.
If we think of “Local Climate” as a class, could we say that “Global Climate”, being a superclass of LC, is an LC class itself? Let’s define the members of the LC class: temperature, air pressure, et cetera. Can we find such members in the GC class?
If not, then GC is not a Climate class, but a name for the average of Climate class instances!

marianomarini

In other words: an engine is a system whose parts are well connected and exchange energy, but can we build an engine of engines?

Leo Smith

Hmm. Nice one
In fact this misuse of statistics is part of a wider, almost philosophical, problem that lies behind nearly all of the trouble the not-so-modern mind seems to have in dealing with the complexity of life as it really is, rather than with the idealised and simplified pictures that are all we seem capable of.
Science itself is just such an idealised and simplified picture. And there will always be a compromise between ‘idealised and simplified to the point of error’ and ‘so complex we can’t compute it anyway’.
Unfortunately my message, that in climate science these two areas overlap massively, is unwelcome to alarmists and skeptics alike.
Everyone wants to seek out the One True Cause of climate change.
The message that they never will, is not desired.

Kip
An excellent post and a good reminder of the limitations of using averages for analysis.
I do have a quibble but it may add to your points. You have used Household and Family incomes interchangeably. The Census Bureau has a very precise definition of both terms. There are about 125 million Households and about 80 million Families in their reports. When using either term the results will give you different data, much like mean versus median and nominal versus real. Even the word “income” can mean different things, market versus aggregate.
To reinforce your point, when looking at the data for income over a 50 or 60 year period, there are changes in demographics etc that can alter the meaning of any comparisons over decades. For instance, the growth in real income of a family with two earners has been greater than those of single households. Why? Because there is a positive correlation between marriage rates and education and thus income. Also, there are many more Households with a single individual today than 50 years ago. The proportion of Single Mom families has grown substantially and the disparity in income between that unit and the two income earner families has grown with it. The real story sometimes is down in the weeds and each piece has to be dissected to understand what is really going on.

Bill Marsh

If you really want to have fun, try having a discussion with an ‘average’ person about percentiles. I once tried (and failed) to convince a young Physicians Assistant that I could not be in the 100th percentile.

Really, isn’t the 100th percentile the value equal to the maximum value in a distribution?

Really, isn’t the 100th percentile the value equal to the maximum value in a distribution?” Phil.

If we were talking about intelligence, he would have to be smarter than himself to be in the 100th percentile.
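Much of the disagreement above comes down to which definition of percentile is used. Under the common inclusive definition (percent of scores at or below yours), the top scorer is at the 100th percentile; under the strict definition (percent strictly below yours), no one can be. A sketch with invented scores:

```python
scores = [55, 60, 72, 88, 95]  # invented test scores

def pct_at_or_below(x, data):
    """Percentile rank, inclusive: percent of values <= x."""
    return 100 * sum(v <= x for v in data) / len(data)

def pct_strictly_below(x, data):
    """Percentile rank, strict: percent of values < x."""
    return 100 * sum(v < x for v in data) / len(data)

print(pct_at_or_below(95, scores))     # 100.0: top score, inclusive definition
print(pct_strictly_below(95, scores))  # 80.0: you can't outscore yourself
```

So both sides of the argument can be right, depending on which convention the test publisher used.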

blcjr

Good observations, but I wish you could have been as detailed in illustrating the problem with “climate” data as you were able to illustrate with income data.
One simple example I sometimes use to illustrate the uselessness of “average temperature” is to talk about weather patterns in my region (central Arkansas). Weather sources will routinely report that the “average” temperature for a particular day is such and such a number. But in reality I would expect it to be bi-modal, especially in the warmer months. We are in a region where the weather is determined by fronts “passing through” the state. On either side of the front the temperature will vary significantly. So one day the high is, say, 90, and the next day it is 80. “Statistics” will say that the “average” temperature “for this time of year” is 85, when in fact it is almost always higher or lower. “Mode” is a better measure of central tendency, and would show the pattern to be bi-modal.
Someone upthread gave the example of Oregon’s “average rainfall.” Again, meaningless, since the regional climate is so different on either side of the Cascades. Climate is at best a regional concept, and even then simple averages can be misleading because even within a well defined climate region averages will vary over time because weather patterns (and long term changes in weather patterns, aka “climate”) are dynamic.
I’m no expert on climate “models” but do they even try to reflect the dynamics of weather/climate change or are they essentially static “models” for discrete periods of time?
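The bi-modal point above can be mocked up with invented daily highs: when a front alternately parks temperatures near 80 and near 90, the arithmetic mean lands at a value few days actually take.

```python
from statistics import mean
from collections import Counter

# Invented daily highs (F): a front alternates between ~80 and ~90.
highs = [80, 81, 79, 80, 90, 91, 89, 90, 80, 90, 79, 91]

print(mean(highs))  # 85: a value almost never actually observed
# The two most common values reveal the bimodal pattern:
print(Counter(highs).most_common(2))
```

The mean of 85 is a perfectly valid statistic, yet it describes a temperature the town rarely experiences; the two modes near 80 and 90 tell the real story.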

Alan McIntire

One of the more exotic types was the geometric mean. Here’s an example I picked up reading articles by Isaac Asimov in “Fantasy and Science Fiction” magazine. He was comparing the size of humans to blue whales and mice. Are we bigger or smaller than the average mammal?
Here are their average masses:
10^5 kg blue whale
70 kg human
3*10^-2 kg mouse
An arithmetic mean would give just about 1/3 of 10^5 kg; most mammals are WAY below average in mean mass. A more reasonable comparison is the geometric mean: (10^5 kg × 70 kg × 3×10^-2 kg)^(1/3), which gives about 59.44 kg. Humans are bigger compared to mice than blue whales are compared to humans.
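Checking that arithmetic in code, with the same three masses:

```python
# Arithmetic vs geometric mean of the three masses above (kg).
masses = [1e5, 70, 3e-2]  # blue whale, human, mouse

arith = sum(masses) / len(masses)  # ~33,357 kg: dominated by the whale
geom = (masses[0] * masses[1] * masses[2]) ** (1 / 3)

print(round(arith, 1))
print(round(geom, 2))  # 59.44 kg: close to a human
```

The geometric mean respects the multiplicative spread of the data, which is why it lands near the human rather than a third of the way to the whale.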

Leo Smith

When looking hard at renewable energy, it occurred to me that in programming a computer, in calculating the cost and benefit of an aircraft design, and indeed when putting a windmill on a grid, the income is derived from the average performance of the engineered solution, but the cost is dominated by worst-case provisions.
Perhaps that is a subject worthy of an essay.
(Anyway, its impact on renewable energy is massive: the worst case of renewable energy is generating nothing, and the cost of covering for that exceeds all the value in the renewable solution.)

pouncer

Paul Krugman walks into the bar, Cheers …
Per capita bar patron income goes up. Income inequality among bar patrons gets worse. Median net worth is higher. The total amount of taxes paid by bar patrons increases. The likelihood a randomly selected patron is also a contributor to the New York Times opinion columns rises astronomically. The chance that a randomly selected patron voted Republican in the last presidential election decreases, slightly. The average “carbon footprint” among bar patrons rises measurably. The amount of hot air expended in conversations among Cliff, Norm, Frasier, and now Paul contributes to global warming by a comical amount …

One of the things that I think is important in any discussion of averages is what is called the “Flaw of Averages.” Boiled down simply, it is that while averages may be used to describe the population the data describes, all too often no single individual data-point will ever fit that average profile. Particularly if more than a single descriptor is used.
See this article: https://www.thestar.com/news/insight/2016/01/16/when-us-air-force-discovered-the-flaw-of-averages.html
So, the question is simply this: the average can be determined, but does it really say anything at all if it is used in an effort to apply that average to what is actually occurring?

Chaotic dynamical systems tend to produce distributions that are the “opposite” of normal distributions, that is, they are heavily weighted at the tails. Averages are least applicable to such systems.
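One way to see why averages fail for heavy-tailed output: for a distribution like the standard Cauchy, the sample mean never settles down as the sample grows, so "the average" conveys very little. A quick simulation sketch (inverse-CDF sampling, invented sample sizes):

```python
import math
import random

random.seed(42)

def cauchy_sample():
    # Standard Cauchy via the inverse-CDF method:
    # tan(pi * (U - 1/2)) for uniform U on [0, 1).
    return math.tan(math.pi * (random.random() - 0.5))

data = [cauchy_sample() for _ in range(100_000)]

# Running means at several sample sizes; for a normal distribution
# these would converge, but for Cauchy they keep jumping around
# because rare enormous outliers dominate the sum.
running = [sum(data[:n]) / n for n in (100, 1_000, 10_000, 100_000)]
print(running)
```

A normal sample of the same size would show the running means tightening steadily around the true mean; the Cauchy's fat tails defeat the law of large numbers, which is the extreme version of the problem with averaging chaotic output.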

The other day I was listening to some blather about the Big Bang being preceded by (even though there is no time yet ???) a singularity that was, among other properties, infinitely hot. But temperature is a measure of average kinetic energy. How can a singularity have an average in the form of temperature? Worse, kinetic energy implies velocity, and velocity implies space over time. I thought those had not “unfurled” yet.

Allow me to inject a curious, yet distantly related fact, into a most enlightening discussion:
Jan Kareš from the Czech Republic did 232 pull-ups on the 19th of June in 2010, establishing what is believed to be the world record for this exercise. My max pull ups ever was 15, some multiple decades ago.
Has anyone determined whether Thursdays in Maharishi Vedic City, Iowa have gotten warmer over the past ten or so years? Then we have to ask, “Warmer how?” … right? — “warmer” at what particular time of day?, … are we talking rainy days?, … cloudy days?, where exactly — under a tree – WHICH tree?, ten feet above the ground?, ten and a HALF feet?, … who measured it? … using what sort of instrument?, … was this person patient enough to use the measuring instrument competently? … was he she sober? … etc.
… seems to be somewhat elusive.

David in Cal

Good article, as far as it goes. However, there are two economic factors that also need to be considered
1. Number of people per household has been shrinking. So, income per person is growing faster than income per household.
2. The quintiles are not made up of the same households, from year to year. Families frequently move to different quintiles over time. I personally have been in all five income quintiles at various points in time.

drednicolson

Also an illustration of the nirvana fallacy, aka “making the perfect the enemy of the good”.

3 inches is about 5% of the height of the boys. Maybe a better analogy would have been taking a trend of 1/10th of an inch per year and using it to justify a model that says it should be built 10 inches higher for future generations (even though the change in the past 20 years has been less than predicted).