Guest essay by Dr Tim Ball
Never try to walk across a river just because it has an average depth of four feet. Martin Friedman
“Statistics: The only science that enables different experts using the same figures to draw different conclusions.” Evan Esar
I am not a statistician. I took university level statistics because I knew, as a climatologist, I needed to know enough to ask statisticians the right questions and understand the answers. I was mindful of what the Wegman Committee later identified as a failure of those working on the Intergovernmental Panel on Climate Change (IPCC) paleoclimate reconstructions.
It is important to note the isolation of the paleoclimate community; even though they rely heavily on statistical methods they do not seem to be interacting with the statistical community.
Apparently they knew their use and abuse of statistics and statistical methods would not bear examination. It was true of the “hockey stick”, an example of misuse and creation of ‘unique’ statistical techniques to predetermine the result. Unfortunately this is an inherent danger in statistics. A statistics professor told me that the more sophisticated the statistical technique, the weaker the data. Anything beyond basic statistical techniques was ‘mining’ the data and moving further from reality and reasonable analysis. This is inevitable in climatology because of inadequate data. As the US National Research Council Report of Feb 3, 1999 noted,
“Deficiencies in the accuracy, quality and continuity of the records place serious limitations on the confidence that can be placed in the research results.”
Methods in Climatology by Victor Conrad is a classic text that identified most of the fundamental issues in climate analysis. Its strength is that it recognizes that the amount and quality of the data are critical, a theme central to Hubert Lamb’s establishing the Climatic Research Unit (CRU). In my opinion statistics as applied in climate has advanced very little since. True, we now have other techniques like spectral analysis, but it, like all those techniques, is meaningless if you don’t accept that cycles exist or don’t have records of adequate quality and length.
Ironically, some techniques, such as moving averages, remove data. Ice core records are a good example. The Antarctic ice core graphs, first presented in the 1990s, illustrate statistician William Briggs’ admonition.
Now I’m going to tell you the great truth of time series analysis. Ready? Unless the data is measured with error, you never, ever, for no reason, under no threat, SMOOTH the series! And if for some bizarre reason you do smooth it, you absolutely on pain of death do NOT use the smoothed series as input for other analyses! If the data is measured with error, you might attempt to model it (which means smooth it) in an attempt to estimate the measurement error, but even in these rare cases you have to have an outside (the learned word is “exogenous”) estimate of that error, that is, one not based on your current data. (His bold)
A 70-year smoothing average was applied to the Antarctic ice core records. It eliminates a large amount of what Briggs calls “real data” as opposed to “fictional data” created by the smoothing. The smoothing diminishes a major component of basic statistics, the standard deviation of the raw data. Smoothing is partly why standard deviation received little attention in climate studies, even though it is a crucial factor in the impact of weather and climate on flora and fauna. The focus on averages and trends was also responsible. More important from a scientific perspective is the role of standard deviation in determining mechanisms.
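The effect of smoothing on standard deviation can be illustrated with a minimal sketch. It does not use the ice core data: the series below is a hypothetical one (a constant level plus random year-to-year variation), and the function name moving_average, the level of 280, the spread of 10 and the 2000-point length are all assumptions chosen only for illustration. Only the 70-point window follows the text.

```python
import random
import statistics


def moving_average(values, window):
    """Simple running mean; returns len(values) - window + 1 points."""
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        # Slide the window forward by one point.
        running += values[i] - values[i - window]
        out.append(running / window)
    return out


random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical annual series (NOT real measurements):
# a constant level plus random year-to-year variation.
raw = [280.0 + random.gauss(0.0, 10.0) for _ in range(2000)]

# 70-point window, matching the 70-year smoothing described in the text.
smoothed = moving_average(raw, 70)

raw_sd = statistics.pstdev(raw)
smoothed_sd = statistics.pstdev(smoothed)

print(f"points: raw={len(raw)}, smoothed={len(smoothed)}")
print(f"standard deviation: raw={raw_sd:.2f}, smoothed={smoothed_sd:.2f}")
```

Under these assumptions the smoothed series has fewer points than the raw series and a much smaller standard deviation, which is the loss of variability the paragraph above describes. A real record with cycles or trends would behave differently in detail.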
Figure 1: (Partial original caption) Reconstructed CO2 concentrations for the time interval between ca. 8700 and ca. 6800 calendar years B.P. based on CO2 extracted from air in Antarctic ice of Taylor Dome (left curve; ref. 2; raw data available via www.ngdc.noaa.gov/paleo/taylor/taylor.html) and SI data for fossil B. pendula and B. pubescens from Lake Lille Gribso, Denmark. The arrows indicate accelerator mass spectrometry 14C chronologies used for temporal control. The shaded time interval corresponds to the 8.2-ka-B.P. cooling event.
Source: Proc. Natl. Acad. Sci. USA 2002 September 17: 99 (19) 12011-12014.
Figure 1 shows a determination of atmospheric CO2 levels for a 2000-year span comparing data from a smoothed ice core (left) and stomata (right). Regardless of the efficacy of each method of data extraction, it is not hard to determine which plot is likely to yield the most information about mechanisms. Where is the 8.2-ka-BP cooling event in the ice core curve?
At the beginning of the 20th century statistics was applied to society. Universities, previously divided into the Natural Sciences and Humanities, saw a new and ultimately larger division emerge, the Social Sciences. Many in the Natural Sciences view Social Science as an oxymoron and not a ‘real’ science. In order to justify the name, social scientists began to apply statistics to their research. A book titled “Statistical Package for the Social Sciences” (SPSS) first appeared in 1970 and became the handbook for students and researchers. Plug in some numbers and the program provides results. The suitability of the data, such as the difference between continuous and discrete numbers, and of the technique was little known or ignored, yet it affected the results.
Most people know Disraeli’s comment, “There are three kinds of lies: lies, damn lies and statistics”, but few understand how the application of statistics affects their lives. Beyond the inaccurate application of statistics is the elimination of anything beyond one standard deviation, which removes the dynamism of society. McDonald’s typifies the application of statistics: they have perfected mediocrity. We sense it when everything sort of fits everyone, but doesn’t exactly fit anyone.
Statistics in Climate
Climate is an average of the weather over time or in a region, and until the 1960s averages were effectively the only statistic developed. Ancient Greeks used average conditions to identify three global climate regions, the Torrid, Temperate, and Frigid Zones, created by the angle of the sun. Climate research involved calculating and publishing average conditions at individual stations or in regions. Few understand how meaningless a measure the average is, although Robert Heinlein implied it when he wrote, “Climate is what you expect, weather is what you get”. Mark Twain also appears to have been aware of it with his remark that, “Climate lasts all the time, and weather only a few days.” A farmer asked me about the chances of an average summer. He was annoyed with the answer “virtually zero” because he didn’t understand that ‘average’ is a statistic. A more informed question is whether it will be above or below average, but that requires knowledge of two other basic statistics, the variation and the trend.
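The farmer’s question can be sketched with a few lines of arithmetic. The ten summer temperatures below are invented for illustration, not observations from any station, and the 0.05 degree tolerance for “exactly average” is likewise an assumption. The sketch computes the three statistics named above: the average, the variation (standard deviation) and the trend (least-squares slope).

```python
import statistics

# Hypothetical summer mean temperatures (deg C) for ten consecutive years.
# These values are invented for illustration only.
summers = [17.2, 19.8, 16.1, 20.4, 18.9, 15.7, 21.0, 17.5, 19.3, 16.6]
years = list(range(len(summers)))

# The average: the single number traditionally published.
mean = statistics.mean(summers)

# The variation: sample standard deviation around that average.
sd = statistics.stdev(summers)

# The trend: least-squares slope in degrees per year.
year_mean = statistics.mean(years)
slope = sum((y - year_mean) * (t - mean) for y, t in zip(years, summers)) / sum(
    (y - year_mean) ** 2 for y in years
)

# How many summers were "average" (within an assumed 0.05 deg tolerance)?
exactly_average = [t for t in summers if abs(t - mean) < 0.05]
above = sum(1 for t in summers if t > mean)
below = sum(1 for t in summers if t < mean)

print(f"mean={mean:.2f}, standard deviation={sd:.2f}, trend={slope:+.3f} per year")
print(f"average summers: {len(exactly_average)}, above: {above}, below: {below}")
```

In this invented record no summer matches the average, and the years split between above and below it. The average alone says nothing about how far a given summer is likely to depart from it.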
After WWII, the demand for predictions for planning and social engineering in postwar societies triggered the development of simple trend analysis. It assumed that once a trend started it would continue. The mentality persists despite evidence of downturns or upturns; in climate it seems to be part of the rejection of cycles.
Study of trends in climate essentially began in the 1970s with the prediction of a coming mini ice age as temperatures declined from 1940. When temperature increased in the mid-1980s they said this new trend would continue unabated. Political users of climate adopted what I called the trend wagon. The IPCC made the trend inevitable by saying human CO2 was the cause and it would continue to increase as long as industrial development continued. Like all previous trends, it did not last as temperatures trended down after 1998.
For year-to-year living and business the variability is very important. Farmers know you don’t plan next year’s operation on last year’s weather, but reduced variability reduces risk considerably. The most recent change in variability is normal and explained by known mechanisms but exploited as abnormal by those with a political agenda.
John Holdren, Obama’s science Tsar, used the authority of the White House to exploit increased variation of the weather and a mechanism little known to most scientists, let alone the public, the Circumpolar Vortex. He created an inaccurate propaganda release about the Polar Vortex to imply it was something new and not natural, and therefore due to humans. Two of the three Greek climate zones are very stable, the Tropics and the Polar regions. The Temperate zone has the greatest short-term variability because of seasonal variations. It also has longer-term variability as the Circumpolar Vortex cycles through Zonal and Meridional patterns. The latter creates increased variation in weather statistics, as has occurred recently.
IPCC studies and prediction failures were inevitable because they lack data, manufacture data, lack knowledge of mechanisms and exclude known mechanisms. Reduction or elimination of the standard deviation leads to loss of information and further distortion of the natural variability of weather and climate, both of which continue to occur within historic and natural norms.