Guest essay by Dr. Tim Ball
“If you torture the data enough, nature will always confess” – Ronald Coase.
Facts are stubborn things, but statistics are more pliable. – Anonymous.
Climatology is the study of average weather over time or in a region. It is very different from Climate Science, which is the study by specialists of individual components of the complex system that is weather. Each part is usually studied independently of the entire system, with little regard for how it interacts with or influences the larger system. The supposed link between the parts is statistics. Climatology has suffered from a pronounced form of the average-versus-discrete problem since the early 1980s, when computer modelers began to dominate the science. From then on climatology was doomed to failure, a decline only accelerated by its hijacking for a political agenda. I witnessed a good example early on, at a conference in Edmonton on prairie climate predictions and their implications for agriculture.
It was dominated by the keynote speaker, a climate modeler, Michael Schlesinger. His presentation compared five major global models and their results. He claimed that because they all showed warming they were valid. Of course they did, because they were all programmed to produce that general result. The problem is that they varied enormously over vast regions; for example, one showed North America cooling while another showed it warming. The audience was looking for information adequate for planning and became agitated, especially in the question period. It peaked when someone asked about the accuracy of his warmer-and-drier prediction for Alberta. The answer was 50%. The questioner replied, "That is useless; my Minister needs 95%." The shouting intensified.
Eventually a man threw his shoe onto the stage. When the room went silent he said, "I didn't have a towel." We learned he had a voice box, and the shoe was the only way he could get attention. He asked permission to go on stage, where he explained his qualifications and put a formula on the blackboard. He asked Schlesinger if this was the formula he used as the basis for his model of the atmosphere. Schlesinger said yes. The man then proceeded to eliminate variables, asking Schlesinger if they were omitted in his work. After a few eliminations he said one was probably enough: "You have no formula left, and you certainly don't have a model." It has been that way ever since with the computer models.
Climate is an average, and in the early days averages were the only statistic determined. In most weather offices the climatologist's job was to produce monthly and annual averages. The subject of climatology was of no interest or concern. The top people were the forecasters, meteorologists whose training was only in the physics of the atmosphere. Even now few know the difference between a meteorologist and a climatologist. When I sought my PhD, essentially only two centers of climatology existed: Reid Bryson's center in Wisconsin and Hubert Lamb's Climatic Research Unit (CRU) at East Anglia. Lamb set up there because the national weather office wasn't interested in climatology. People ridiculed my PhD being done in the Geography Department at the University of London, but other university departments weren't doing such work. Geography accommodated it because of its chorologic objective: the study of the causal relationships between geographic phenomena in a region.
Disraeli's admonition about "lies, damned lies and statistics" was exemplified by the work of the IPCC and its supporters. I realized years ago that the more sophisticated the statistical technique, the more likely the data was inadequate. In climate the data was inadequate from the start, as Lamb pointed out when he formed the CRU. He wrote in his autobiography, "…it was clear that the first and greatest need was to establish the facts of the past record of the natural climate in times before any side effects of human activities could well be important." It is even worse today. Proof of the inadequacy is the increasing use of ever more bizarre statistical techniques. Now they invent data, as in parameterization, and use the output of one statistical contrivance or model as real data in another model.
The climate debate cannot be separated from environmental politics. Global warming became the central theme of the claim, promoted by the Club of Rome, that humans are destroying the planet. Their book Limits to Growth did two major things, both removing understanding and creating a false sense of authority and accuracy. First was the simplistic application of statistics beyond an average, in the form of straight-line trend analysis. Second, predictions were given awesome but unjustified status as the output of computer models. They wanted to show we were heading for disaster and selected the statistics and process to that end. This became the method and philosophy of the IPCC. Initially we had climate averages. Then in the 1970s, with the cooling from 1940, trends became the fashion. Of course, the cooling trend did not last and was replaced in the 1980s by an equally simplistic warming trend. Now they are trying to ignore another cooling trend.
One problem developed with the switch from average to trend. People trying to reconstruct historic averages needed a period in the modern record for comparison. The 30-year Normal was created, with 30 chosen because it was taken to be a statistically significant sample, n, of any population N. The first one was the period 1931-1960, chosen because it was believed to have the best instrumental data sets. They keep changing the 30-year period, which only adds to the confusion. It is also problematic because the number of stations has been significantly reduced. How valid are studies done using earlier "Normal" periods?
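As a minimal sketch of the problem (in Python, with invented station values, since none are given here), the same observation yields a different "anomaly" depending on which 30-year base period defines the Normal:

    # Minimal sketch with hypothetical numbers: the same measurement produces a
    # different "departure from normal" under different 30-year base periods.
    import random

    random.seed(1)
    # Fake annual mean temperatures for 1901-2000, with a slight upward drift.
    temps = {y: 10.0 + 0.01 * (y - 1901) + random.gauss(0, 0.5) for y in range(1901, 2001)}

    def normal(base_start, base_end):
        """Average over a 30-year base period, e.g. 1931-1960."""
        vals = [temps[y] for y in range(base_start, base_end + 1)]
        return sum(vals) / len(vals)

    obs = temps[2000]
    print("Anomaly vs 1931-1960 Normal:", round(obs - normal(1931, 1960), 2))
    print("Anomaly vs 1961-1990 Normal:", round(obs - normal(1961, 1990), 2))
    # Same observation, two different anomalies -- the confusion described above
    # whenever the base period keeps changing.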
Unfortunately, people started using the Normal for the wrong purposes. It is now treated as the average weather overall, when it is only the average weather for a particular 30-year period. It is actually inappropriate for climate because most changes occur over longer periods.
But there is another simple statistical measure they effectively ignore. People, such as farmers, who use climate data in their work know that one of the most important statistics is variation. Climatology was aware of this decades ago as it became aware of changing variability, especially of mid-latitude weather, with changes in the upper-level winds. It was what Lamb was working on and what Leroux continued.
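A trivial illustration with invented numbers: two seasons can share the same average temperature while being utterly different for anyone planning around the weather, and only a measure of variation shows it.

    # Invented numbers: identical mean, very different variability.
    from statistics import mean, stdev

    steady   = [14, 15, 16, 15, 14, 16, 15]   # degrees C, a calm season
    volatile = [5, 25, 8, 22, 10, 24, 11]     # same average, wild swings

    print(mean(steady), round(stdev(steady), 1))      # 15.0  0.8
    print(mean(volatile), round(stdev(volatile), 1))  # 15.0  8.4
    # The averages match; only the standard deviation reveals the difference
    # that matters for frost risk, planting dates, and crop choice.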
Now, as the global trend swings from warming to cooling, these winds have switched from zonal to meridional flow, causing dramatic increases in the variability of temperature and precipitation. The IPCC, cursed with the tunnel vision of political objectives and limited by its terms of reference, did not accommodate natural variability. It can only claim, incorrectly, that the change is somehow proof of its already failed projections.
Edward Wegman, in his analysis of the "hockey stick" issue for the Barton Congressional committee, identified a bigger problem in climate science when he wrote:
"We know that there is no evidence that Dr. Mann or any of the authors in paleoclimatology studies have had significant interactions with mainstream statisticians."
This identifies the problem that has long plagued the use of statistics, especially in the social sciences: their use without knowledge or understanding.
Many used a book referred to as SPSS (it is still available), the acronym for the Statistical Package for the Social Sciences. I know of people simply plugging in numbers and getting totally irrelevant results. One such misapplication of statistics undermined the career of an English geomorphologist who completely misapplied trend-surface analysis.
IPCC projections fail in large part because of inappropriate statistics and statistical methods. Of course, it took a statistician to identify the corrupted use of statistics and to show how it fooled the world into disastrous policies, but that only underlines the problem with statistics, as the two opening quotes attest.
There is another germane quote by mathematician and philosopher A.N. Whitehead about the use, or misuse, of statistics in climate science.
There is no more common error than to assume that, because prolonged and accurate mathematical calculations have been made, the application of the result to some fact of nature is absolutely certain.
_______________
Other quotes about statistics reveal a common understanding of their limitations and, worse, of their application. Here are a few:
He uses statistics as a drunken man uses lampposts – for support rather than for illumination. – Andrew Lang.
One more fagot (bundle) of these adamantine bandages is the new science of statistics. – Ralph Waldo Emerson
Then there is the man who drowned crossing a stream with an average depth of six inches. – W E Gates.
Satan delights equally in statistics and in quoting scripture. – H G Wells
A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions. – M J Moroney.
Statistics are the modern equivalent of the number of angels on the head of a pin – but then they probably have a statistical estimate for that. – Tim Ball
DaveGinOly says: October 3, 2013 at 1:21 pm
“…Its pretty obvious that if you create a simulation in which heat is driven by a factor in the simulation, an increase in that factor will yield more heat …”
I personally believe you are exactly correct, Dave, and that it is that simple:
If one of your parameter changes is that heat is added to the system, and there are no major negative feedbacks, the system must warm, no matter how many hoops your computer program jumps through.
Anyway, some commenters, notably AlecM (who posts on WUWT as AlecMM), suggested looking at Hansen_etal_1981 (if you Google this it will come up). AlecM believes that Hansen miscalculated the greenhouse effect and that more focus should be on LTE in the atmosphere (sorry for the paraphrase). What struck me was that Hansen attributes 90% of the warming to the 15 micron band ("downwelling radiation"), in essence that CO2 is the main driver of climate. There is a reference for this pointing to an earlier paper, which then relates to Manabe and then finally to the first recent-era estimate of atmospheric heating due to greenhouse gases, by Moller.
I think I've already read the paper in question, and indeed consider its figure 7 to be the direct precursor of figure 1.4 in AR5. I spent a quiet, happy day a couple of years ago attaching the UAH LTT onto Hansen's graph to note that the entire warming of the 1980s is solidly within his estimate of "natural variation" (in spite of the fact that it curves downward for no reason he could possibly justify, given the mean warming without CO_2 of the previous two centuries) and is already solidly distinguishable from his three climate sensitivity predictions.
Note well that he doesn’t actually claim that the CO_2 is the main driver of the climate — the whole point of his “climate sensitivity” estimates which were and continue to be the “catastrophic” part of the problem is that he assumes that the warming due to CO_2 alone will be multiplied by a factor of 2 to 5 by water vapor. IIRC, he describes this in some detail in the paper, and back in 1981, at least, he was still somewhat uncertain about its feedback except that it would definitely be positive, hence he posted the three curves that would produce anywhere from 1.4 to 5.6 C of warming by 2100. 1.4 was “CO_2 alone”.
In 1981 it was basically impossible to compute anything but nearly trivial single-slab models — 1D spatiotemporal average reductions of the Earth, atmosphere, and oceans. The Cray 1 owned by NCAR at the time had 8 MB of core (1 million 64-bit words), operated at a whopping 80 MHz, and on a good day could run at around 100 MFLOPS. The computer I'm typing this reply on, an aged laptop with an Intel Core Duo, 4 GB of RAM, and a 2.5 GHz clock, has more effective computational power than all the supercomputers on Earth put together at that time. Hell, my CELL PHONE has more compute power, by far, than any supercomputer of that day. It would have been considered a munition with export restrictions at the time, lest the Evil Empire use it to design nuclear bombs. (There's an app for that :-).
However, that was then, this is now. Grant Petty pointed out in another conversation I've been participating in (one I've just retired from, as it was with the PSI whackos and going nowhere as usual) that anyone can go grab the source for e.g. CAM 3.x, a genuine open-source GCM (written, sadly, in Fortran 90) and, perhaps more usefully, access and read its documentation. To the extent that CAM 3.x is typical, it is no longer the case that GCMs are in any sense "simple" single-slab models. They partition the atmosphere into many layers, over an (IMO unbelievably stupid) regular but weakly adaptive latitude/longitude decomposition of the Earth's surface into cells. CAM 3 at least does use a single-slab ocean (basically only considering the surface, with no real deep-ocean heat transport), but it very definitely has a radiation physics engine that is a lot more sophisticated than the one in Hansen's 1981 paper. It also devotes a considerable amount of code to handling e.g. the water cycle, aerosols, etc.
The point is that most of the criticisms of GCMs (that I've read on e.g. WUWT) are levelled not at the actual structure of the GCM, but at a sort of "toy reduction" of the GCM into a one-dimensional single-slab model, as if this were the mid-1950s and computers were a piece of paper and a pencil (and maybe a mechanical adding machine off to the side, or a slide rule to help with the heavy lifting). They overestimate this, underestimate that. I'm guilty of this myself — we naturally try to simplify a complex thing so that we can discuss it at all, because believe me, I've LOOKED at the CAM 3.1 code and it is a poorly documented mess, in my rather professional opinion. (I teach programming among other things, and have written a few zillion lines of code on my own at this point in my life, starting in 1965 with a plastic mechanical adder that used marbles to activate plastic marble-track flip-flops, which one hand-programmed to add two numbers or do a simple binary multiplication. I was 10. It was cool.) The CAM 3.1 TOPLEVEL documentation I refer to above is actually pretty good, though. It's just too far removed from the code itself.
GCMs are not really susceptible to this sort of reductionist consideration, although we try. In the end, there are thousands of lines of code, many megabytes of initialization data, hundreds of control variables, hundreds of separate physical process computations controlled by those variables and coupled by a dynamical engine that has to manage everything from bulk mass transport to latent heat to parcel-to-parcel radiation.
Every single bit of this is subject to a rather staggering list of possible errors.
The code could have bugs. In fact, by the time you have this many lines of code, the code DOES have bugs, I promise — the probability is near unity that there are non-fatal bugs. Having written code with deep, functional bugs before, I can tell you they are by far the most difficult thing to expunge, because the only way to know they are there is by comparing the result of the computation to a result you know some other way. If the bug produces otherwise "sensible" (but incorrect) results, you'll literally have no reason to think it is there. I'm talking about things as simple as typing N instead of M in a single line of code, in a module where both make sense and are used, both have values with identical ranges, and where using N instead of M produces a perfectly sensible result. I had just this sort of bug in a huge program I wrote 20 years ago in Fortran, and the only way I knew it was there was because the output had to satisfy a certain scaling, and although it had exactly the right form (was sensible) it failed the scaling test. I was lucky — the program could otherwise have run for years producing wrong answers.
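For what it's worth, here is a toy Python version of that kind of bug (not the actual Fortran, obviously): the wrong count used in a normalization, invisible on a square test grid and caught only because a known invariant — the scaling test — fails on a realistic one.

    # Toy illustration of an N-for-M style bug: area-weighted global mean on an
    # nlat x nlon grid, with the wrong dimension used in the normalization.
    import math

    def global_mean(field, buggy=False):
        nlat, nlon = len(field), len(field[0])
        lats = [-90.0 + (i + 0.5) * 180.0 / nlat for i in range(nlat)]
        w = [math.cos(math.radians(lat)) for lat in lats]   # per-latitude area weight
        wsum = sum(w)
        # BUG: dividing by nlat instead of nlon -- both are in scope, both "make
        # sense", and on a square test grid (nlat == nlon) the answer is identical.
        cells = nlat if buggy else nlon
        total = sum(w[i] * field[i][j] for i in range(nlat) for j in range(nlon))
        return total / (wsum * cells)

    ones = [[1.0] * 144 for _ in range(96)]   # a realistic 96 x 144 grid
    print(global_mean(ones))                  # 1.0 -- passes the scaling test
    print(global_mean(ones, buggy=True))      # 1.5 -- "sensible looking", but wrong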
The code is implemented on a ludicrous grid. I'm serious, who in their right mind would perform NUMERICAL quadratures on a spherical manifold on a spherical polar grid? Did I mention that the bug above in my own code was in a module designed to perform quadratures (spherical harmonic projections) on a spherical polar grid? It seems so easy, so sensible — spherical polar coordinates are the natural analytic representation of a sphere, after all — until you hit that pesky Jacobian, the map problem. If you divide latitude and longitude into equal-interval cells that are, say, 5×5 degrees, they are nearly square cells 350 miles to a side at the equator, and thin slices 350 miles north-to-south that you can STEP across near the poles. Nearly all quadrature methods rely on breaking a continuum up into some sort of regular grid so one can do nice, simple polynomial fits across the cells, and while there are kludges — renormalizing the longitude grid size at some latitude, creating a complex map to account for the ever-varying area of the cells and the ever-varying VOLUME of the PARCELS you are trying to transport between the cells — all I can say is "eeeewww". Been there, done that, not the right way.
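To put a rough number on that distortion, here is a back-of-envelope sketch using the standard spherical-zone area formula (nothing from CAM itself):

    # Back-of-envelope: the area of a dlat x dlon cell on a sphere is
    #   A = R^2 * dlon * (sin(lat2) - sin(lat1))
    # so a "uniform" 5 x 5 degree grid is anything but uniform in area.
    import math

    R = 6371.0  # mean Earth radius, km

    def cell_area(lat1_deg, lat2_deg, dlon_deg=5.0):
        dlon = math.radians(dlon_deg)
        return R ** 2 * dlon * (math.sin(math.radians(lat2_deg)) - math.sin(math.radians(lat1_deg)))

    equator = cell_area(0.0, 5.0)     # roughly 3.1e5 km^2
    polar   = cell_area(85.0, 90.0)   # roughly 1.3e4 km^2
    print(round(equator), round(polar), round(equator / polar, 1))
    # Same nominal "cell" in the code, about a 23:1 difference in the area (and
    # parcel volume) it represents -- the Jacobian / map problem in a nutshell.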
The "right way" is to use a symmetric, uniformly scalable icosahedral tessellation of the sphere:
http://i.stack.imgur.com/Cyt2B.png
or better yet:
http://www.mathworks.com/matlabcentral/fx_files/37004/3/screen_shot.jpg
By being uniformly renormalizable/scalable, one can systematically run the code at different spatial resolutions without ANY tweaking and actually determine whether the result is spatially convergent! The price you pay, of course, is that you have to hand re-code and test all of your quadrature and ODE solution and transport routines, because they are all DESIGNED FOR A RECTANGULAR PLANAR GRID! Adaptive quadrature on a sphere in polar coordinates, in contrast, just doesn't work BECAUSE one cannot really properly handle the cell-shape distortion — the map problem where Antarctica is rendered huge when the world is presented as a large rectangle.
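For the curious, here is a bare-bones sketch of what I mean (my own toy code, not anything from CAM or any GCM): build the icosahedron, split each triangular face into four, push the new vertices out to the unit sphere, and repeat. Every resolution is generated by the same rule, which is exactly what makes a clean convergence study possible.

    # Toy icosphere builder: subdivide an icosahedron, projecting midpoints onto
    # the unit sphere at every level.
    import numpy as np
    from itertools import combinations

    def base_icosahedron():
        phi = (1.0 + 5.0 ** 0.5) / 2.0
        verts = []
        for s1 in (-1.0, 1.0):
            for s2 in (-1.0, 1.0):
                verts += [(0.0, s1, s2 * phi), (s1, s2 * phi, 0.0), (s1 * phi, 0.0, s2)]
        verts = np.array(verts)
        verts /= np.linalg.norm(verts, axis=1, keepdims=True)
        edge = min(np.linalg.norm(verts[0] - verts[j]) for j in range(1, 12))
        # The 20 faces are exactly the vertex triples whose sides all equal the edge length.
        faces = [f for f in combinations(range(12), 3)
                 if all(abs(np.linalg.norm(verts[a] - verts[b]) - edge) < 1e-6
                        for a, b in combinations(f, 2))]
        return list(verts), faces

    def subdivide(verts, faces):
        verts = list(verts)
        cache = {}                              # one shared midpoint per edge
        def midpoint(i, j):
            key = (min(i, j), max(i, j))
            if key not in cache:
                m = verts[i] + verts[j]
                verts.append(m / np.linalg.norm(m))   # project onto the sphere
                cache[key] = len(verts) - 1
            return cache[key]
        new_faces = []
        for a, b, c in faces:
            ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
            new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
        return verts, new_faces

    verts, faces = base_icosahedron()
    for level in range(1, 4):
        verts, faces = subdivide(verts, faces)
        areas = [0.5 * np.linalg.norm(np.cross(verts[b] - verts[a], verts[c] - verts[a]))
                 for a, b, c in faces]
        print(level, len(faces), "cells, max/min area ratio", round(max(areas) / min(areas), 2))
    # 80, 320, 1280 ... cells; the area spread stays modest at every level, versus
    # the ~23:1 swing of a 5-degree lat/lon grid between equator and pole.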
I frankly doubt that anybody in climate science has any idea how much error is introduced by this. CAM 3.0 contains a comment that when they DID renormalize the cell sizes near the poles, the code sped up and maybe got more accurate (understandable because by the time your grid is barely adequate for the equator, you’re parsing cells you could throw a softball over at the poles). At the very least it costs compute time, hence precision.
And then there is the physics. CAM 3.1 integrates time in hour-long timesteps. Each hour, a staggering amount of physics per cell has to be updated on the basis of mass and energy transport. In each case the physical process is known but has to be approximated/averaged and renormalized to correspond to the (distorted, non-uniform) cells. The averaging ALREADY "integrates" over all shorter time/length-scale processes, with the catch being that we do not know and cannot measure whether or not we are doing that integral correctly (the only way to do so is to run the GCM code at ever finer length/time granularity and demonstrate some sort of convergence using NON-approximated code). All of these processes are coupled by the coarse-grained computation, but they have to be computed SEPARATELY in the step computations (otherwise you'd have to have radiative code in the middle of your latent-heat code in the middle of your mass-transport code in the middle of your albedo code), so all sorts of dynamically coupled processes are again necessarily handled semi-heuristically, as if they were independent and separate at the finest-grained timescale used.
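A cartoon of that structure, just to make the "computed separately, coupled only by the step" point concrete; every process function below is a placeholder, not real physics:

    # Cartoon of an operator-split timestep: each process updates the state in
    # turn, as if independent over the hour, even though in reality they are
    # coupled continuously.  The process functions are placeholders.

    DT = 3600.0  # one hour, in seconds

    def dynamics(state, dt):       # bulk mass/energy transport between cells
        state["T"] += 0.0 * dt
        return state

    def radiation(state, dt):      # radiative heating/cooling of the column
        state["T"] += 1.0e-6 * dt
        return state

    def moist_physics(state, dt):  # condensation, latent heat, precipitation
        state["q"] -= 1.0e-9 * dt
        state["T"] += 0.5e-6 * dt
        return state

    PROCESSES = [dynamics, radiation, moist_physics]

    def step(state, dt=DT):
        # Sequential ("split") application: each process sees the state already
        # modified by the ones before it in this same step, and the ordering
        # itself is a modeling choice with consequences at coarse time resolution.
        for process in PROCESSES:
            state = process(state, dt)
        return state

    state = {"T": 288.0, "q": 0.01}   # one cell: temperature (K), specific humidity
    for hour in range(24):
        state = step(state)
    print(state)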
I’m not saying they are doing this wrong, only that it is essentially impossible to be certain that they are doing it right any other way besides COMPARING THE PREDICTIONS OF THE CODE TO REALITY!
This is the fundamental rule of predictive modeling in all venues. When you write a complex program to predict the future, you can claim anything you like about how fabulous, accurate, complete, and well-written your code is. It can match the recent past (the training set) perfectly well; it can hindcast the remote past (where it is known) perfectly well (although AFAIK, NO GCM can hindcast the secular variations of any significant portion of the climate history across human-historical, let alone geological, timescales); but if it does not actually predict the future, it doesn't work, and your assertions of fabulosity and accuracy just aren't to be taken seriously.
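The rule takes about twenty lines to state in code, with a deliberately trivial trend model standing in for the GCM and invented data standing in for the climate record; the only point is the split between the training period and the unseen "future":

    # Fit on the past, then score on data the model never saw.  The "model" is a
    # trivial least-squares trend on invented data; only the split matters here.
    import random

    random.seed(2)
    series = [0.01 * t + random.gauss(0, 0.2) for t in range(100)]
    train, future = series[:70], series[70:]

    n = len(train)
    tbar = (n - 1) / 2.0
    ybar = sum(train) / n
    sxx = sum((t - tbar) ** 2 for t in range(n))
    sxy = sum((t - tbar) * (y - ybar) for t, y in enumerate(train))
    slope = sxy / sxx
    intercept = ybar - slope * tbar

    def predict(t):
        return intercept + slope * t

    train_err  = sum(abs(predict(t) - y) for t, y in enumerate(train)) / len(train)
    future_err = sum(abs(predict(70 + k) - y) for k, y in enumerate(future)) / len(future)
    print("mean error on training period:", round(train_err, 3))
    print("mean error on unseen 'future':", round(future_err, 3))
    # Matching the training period is cheap; only the second number says anything
    # about whether the model "works".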
This is true one model at a time. The mean of three models that fail to predict the future one model at a time is not expected to do better than any of the three models — indeed, it CANNOT do better than the BEST model (think about it!) and one is better off using the best model, and better off still working on improving the models and not asserting that they work when they manifestly don’t.
This SHOULDN'T be all that big a deal. If your code doesn't work, you work on it until it does. Again, a simple, obvious rule of computer programming. Your code can pass all sorts of internal tests and still be broken. Another simple rule of computer programming, and the reason real code goes through alpha and beta test phases. In computerspeak, for all that it is "version 3", CAM 3.x is IMO no better than beta code — it is past alpha in that "it runs", but it is very definitely at the stage where one has to actually test it against reality, which is what beta testing IS. Does your game actually work when the consumer gets their hands on it, or does it lock up when you jerk the joystick and press fire three times within one second? My wife has a genius for revealing hidden flaws in supposedly release-ready computer code — a lot of the time, problems are revealed by somebody who uses it in a way the designers didn't anticipate and hence activates a completely untested pathway (one that often shows, when the bug is resolved, that even the supposedly TESTED pathways weren't working right, producing sensible but wrong answers or activity).
If I had a ten-year grant for (say) a million dollars a year for salary, staff, computers, and indirect costs, I'd personally tackle a complete white-room rewrite of e.g. CAM 3, taking its documentation and references only and re-implementing the whole thing in C or C++ on top of a scalable truncated-icosahedral grid: starting by inventing and perfecting adaptive quadrature and interpolation routines, continuing by building a scalable, tessellated world map (needed to get height above sea level, surface albedo, and the nature of the surface — forest, rock, desert, ocean, ice — right), and then slowly, gently adding physics on top of it, probably redoing the dynamical engine (which is likely to be necessary on top of the new grid). It would actually be fun, I think, and would very likely take the rest of my professional career to accomplish. I might even throw in a few things I've learned and coded that can do EMPIRICAL nonlinear function approximation — neural networks and other sorts of Bayesian predictors — that I'd bet a substantial amount of money could easily be trained to beat the physics-based code. I've fantasized about building an empirically trained climate predictor (neural network) on an icosahedral grid for some time now. A network would discover things like the linear trend and the simple nonlinear functional relationships that modify it quite quickly (and without ever "identifying" it as an actual linear function); the really interesting thing would be seeing whether it could actually predict trial-set behavior outside the range exemplified within the training set, the "challenge" of ALL predictive models. Interpolation is easy, extrapolation difficult.
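That last sentence is easy to demonstrate with any flexible empirical fit; here a toy polynomial stands in for the trained network:

    # Interpolation easy, extrapolation hard: a flexible fit (a degree-9 polynomial
    # standing in for a trained network) tracks the truth inside the training range
    # and goes wild just outside it.
    import numpy as np

    rng = np.random.default_rng(0)
    x_train = np.linspace(0.0, 1.0, 30)
    y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.05, x_train.size)

    coeffs = np.polyfit(x_train, y_train, deg=9)   # over-flexible empirical model

    def truth(x):
        return np.sin(2 * np.pi * x)

    for x in (0.5, 0.9, 1.2, 1.5):                 # last two lie outside the training range
        print(f"x={x:.1f}  fit={np.polyval(coeffs, x):+10.2f}  truth={truth(x):+5.2f}")
    # Inside [0, 1] the fit is close; beyond it the polynomial diverges rapidly --
    # the perennial problem for any empirically trained predictor.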
rgb
Now here's some good news. The National Groundwater Assoc. reports that 94% of the EPA has been furloughed. Wouldn't it be lovely if this worked out to be like a baseball strike? After living without them long enough, we could all learn that we could live without them.
rgbatduke says: October 4, 2013 at 7:52 am
THANK YOU! (Can I find a bigger font? 8<) )
More seriously, should not the rock, ice, granite, copper, bronze, brass, iron, silver, and finally gold standards be the "test" against which we measure the value of yesterday's and today's Global Circulation Models before we accept tomorrow's Global Circulation Models? (Granted, building a valid model has to be done – and by that I mean NOT rebuilding tomorrow's models using code copied yet again from the 1980s.)
Today's GCMs began with the early computer models from Colorado's attempts to model the aerosol and sulfuric acid particles coming from extremely localized point sources inside local regions: specifically, the nickel smelter in Canada north of Lake Erie, and the smog/air particles getting trapped by the convection inversions in the Los Angeles basin. The first target was to force the smelter's shutdown during the "acid rain" scares; the second was a legitimate and very important attempt to analyze the LA Basin for smog control. The problem of CAGW starts with those "local" problems, because those calculations (valid at medium latitudes with simple grid-square approximations) were the foundation used to "grow" those same approximations into a unified globe-spanning approximation. Their error in Colorado was convincing a very, very willing group of other "enablers" that the small details of particle and vapor differential equations, which can be properly integrated up from meter x meter x meter volumes into kilometer x kilometer x kilometer cubes – which can (sort of) work most of the time – could be integrated up again into a 50,000 kilometer sphere, valid not only over a single day but over weeks, months, years, and multi-decade analyses of all temperatures and ground conditions.
Like the "models" of glacier ice flow: what is correct and evident and simple and measurable when looking at a meter of the "ideal" rock-ice "average" coefficients and slopes and densities and friction values is simply "integrated" over an "average" entire valley! The hills and cliffs and varying rock conditions and shapes and protuberances inside that valley (not to mention the changing slope from top to bottom of a 100 km long valley or icecap!) are "smoothed over" by one value. The equation is (almost) a valid approximation of an "ideal" glacier, but it fails because that computer model cannot correctly predict the performance and speed and shape and melt rate and mass-gain rate of any single specific glacier in the real world.
So, how do you "test" a GCM?
The first standard should be the "rock in space": the GCM MUST be able to predict, in 3D latitude and longitude on a simple 100 x 100 km grid, the temperature of the entire actual moon's surface (no atmosphere, no ice, no water, no phase changes, solid simple rock types of simple albedo that don't change over time). They DO have to get right the heat lags and delays and surface exchange rates, the emissivities and outward long-wave radiation, the rotation of the sphere, the changes in solar energy over a year's orbit, the spherical coordinate and grid problem, and the basic radiation/absorption problems of albedo and emissivity. Throw in the back-of-moon problem (rough mountain and crater texture) and the front-of-moon problem (smooth "seas" and isolated white craters) to prove that the various land masses and ocean areas can be modeled accurately in a heat-transfer model. We have the Apollo surface temperature data spanning many decades: why don't they show us the model that gets these "simple" lunar temperatures right over every day of that entire period? If the GCM is "stable" then it MUST be able to "stabilize" to a continuous plot of the hour-by-hour temperature changes of the moon over that entire period of five decades.
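The zeroth-order physics of that test fits in a few lines (the standard radiative-balance formula; the albedo and emissivity below are ballpark guesses, not mission data), which is exactly why it makes a good entry-level standard:

    # Zeroth-order "rock in space" check: instantaneous radiative equilibrium of a
    # slowly rotating, airless surface.  Absorbed sunlight = emitted IR per unit area:
    #     (1 - A) * S * cos(z) = eps * sigma * T^4
    import math

    S     = 1361.0     # solar constant at 1 AU, W/m^2
    A     = 0.11       # rough Bond albedo of the lunar surface (ballpark)
    EPS   = 0.95       # rough infrared emissivity (ballpark)
    SIGMA = 5.670e-8   # Stefan-Boltzmann constant, W m^-2 K^-4

    def equilibrium_T(zenith_deg):
        cosz = max(math.cos(math.radians(zenith_deg)), 0.0)
        if cosz == 0.0:
            return float("nan")   # night side: set by heat stored in the regolith, not this balance
        return ((1 - A) * S * cosz / (EPS * SIGMA)) ** 0.25

    for z in (0, 30, 60, 85):
        print(f"solar zenith {z:2d} deg -> ~{equilibrium_T(z):5.0f} K")
    # The subsolar point comes out near 390 K, in the right neighborhood of the
    # measured lunar daytime values; a GCM stripped down to this configuration should
    # reproduce the full day-night cycle, including the thermal lag this formula ignores.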
Next, the "ice in space": add the ice and vapor changes. Accurately model a comet passing by the sun: phase changes, deposition and emission changes of mass flows, various black and white albedo surfaces that change with time, long-term radiation-change problems spanning centuries. See if the "ice standard" DOES predict the different comet glows over time.
Then go to the "Copper and Brass and Bronze" models: Mercury, Venus, Mars. Do the GCMs simulate, over centuries, the no-water-vapor, no-phase-change, no-surface-terrain-or-ice problems of those "simple" atmospheres?
THEN try the Earth's simple one-layer or two-layer atmosphere. This begins to test the actual oceans and ice and land-mass changes of albedo over the year.
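For the one-layer case there is already a classic textbook toy model (again, not output from any GCM) that sets the scale of what the test should reproduce:

    # Classic textbook single-layer ("grey") atmosphere: a fully IR-absorbing slab
    # above a surface.  The energy balance gives T_surface = 2**0.25 * T_effective.
    # Numbers are the usual textbook values, not output of any model.
    S     = 1361.0    # solar constant, W/m^2
    A     = 0.3       # planetary albedo
    SIGMA = 5.670e-8  # Stefan-Boltzmann constant

    T_eff = ((1 - A) * S / (4 * SIGMA)) ** 0.25    # ~255 K, no atmosphere
    T_surf_one_layer = 2 ** 0.25 * T_eff           # ~303 K, one opaque layer

    print(round(T_eff, 1), round(T_surf_one_layer, 1))
    # The single slab overshoots the observed ~288 K mean surface temperature because
    # a real atmosphere is only partially absorbing -- exactly the kind of gap the
    # staged tests above are meant to expose before adding oceans, ice, and albedo changes.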
Yes, that does sound like a plan of sorts, doesn't it? The closest thing I've heard of to following this sort of program is the water-world test of four GCMs earlier this year. All four converged to different answers. Of course we don't know what the right answer is, but we can be certain that at least three of the four were wrong. But I do not pretend to know what steps modelers have taken to verify their models. One hopes they've done precisely what you suggest, very early on. As you say, if you cannot do the simple problems, how can you expect to do well on the difficult ones, and the Earth is difficult.
rgb