A few weeks ago I wrote a piece highlighting a comment made in the paper “Earth’s Energy Imbalance and Implications”, by James Hansen et al. (hereinafter H2011). Some folks said I should take a real look at Hansen’s paper, so I have done so twice: first a quick look in “Losing Your Imbalance”, and now this study. The claims and conclusions of the H2011 study are based mainly on the ocean heat content (OHC), as measured in large part by the data from the Argo floats, so I thought I should look at that data. The Argo temperature and salinity measurements form a great dataset that gives us much valuable information about the ocean. The H2011 paper utilizes the recent results from “How well can we derive Global Ocean Indicators from Argo data?”, by K. von Schuckmann and P.-Y. Le Traon (hereinafter SLT2011).
Figure 1. Argo float. Complete float is about 2 metres (6′) tall. SOURCE: Wikipedia
The Argo floats are diving floats that operate on their own. Each float measures one complete vertical temperature profile every ten days. The vertical profile goes down to either 1,000 or 2,000 metres depth. Each float reports its dive’s results by satellite before the next dive.
Unfortunately, as used in H2011, the Argo data suffers from some problems. The time span of the dataset is very short. The changes are quite small. The accuracy is overestimated. Finally, and most importantly, the investigators are using the wrong method to analyze the Argo data.
First, the length of the dataset. The SLT2011 data used by Hansen is only 72 months long. This limits the conclusions we can draw from the data. H2011 gets around that by only showing a six-year moving average of not only this data, but all the data he used. I really don’t like it when raw data is not shown, only smoothed data as Hansen has done.
Second, the differences are quite small. Here is the record as shown in SLT2011. They show the data as annual changes in upper ocean heat content (OHC) in units of joules. I have converted OHC change (Joules/m2) to units of degrees Celsius change for the water they are measuring, which is a metre-square column of water 1,490 metres deep. As you can see, SLT2011 is discussing very small temperature variations. The same is true of the H2011 paper.
Figure 2. Upper ocean temperatures from Schuckmann & Le Traon, 2011 (SLT2011). Grey bars show one sigma errors of the data. Red line is a 17-point Gaussian average. Vertical red line shows the error of the Gaussian average at the boundary of the dataset (95% CI). Data digitized from SLT2011, Figure 5 b), available as a comma-separated text file here.
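For anyone who wants to check the conversion from heat content to temperature, here is a minimal sketch. The density and specific heat are typical textbook values for seawater and are assumptions, not numbers taken from SLT2011; the function name is purely illustrative.

```python
# Assumed seawater properties (typical textbook values, not from SLT2011).
RHO_SEAWATER = 1025.0     # kg/m^3
CP_SEAWATER = 3990.0      # J/(kg K)
COLUMN_DEPTH_M = 1490.0   # metre-square column, 1,490 m deep, as in the text


def ohc_to_delta_t(delta_ohc_j_per_m2):
    """Temperature change (deg C) of the column for an OHC change in J/m^2."""
    heat_capacity_per_m2 = RHO_SEAWATER * CP_SEAWATER * COLUMN_DEPTH_M  # J/(m^2 K)
    return delta_ohc_j_per_m2 / heat_capacity_per_m2


# An OHC change of about 6.1e7 J/m^2 is roughly a hundredth of a degree.
print(round(ohc_to_delta_t(6.1e7), 4))
```

With those assumed values, the column takes about 6 billion joules per square metre to warm by one degree, which is why OHC changes that look large in joules turn into hundredths of a degree.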
There are a few things of note in the dataset. First, we’re dealing with minuscule temperature changes. The length of the gray bars shows that SLT2011 claims that we can measure the temperature of the upper kilometer and a half of the ocean with an error (presumably one sigma) of only ± eight thousandths of a degree …
Now, I hate to argue from incredulity, and I will give ample statistical reasons further down, but frankly, Scarlett … eight thousandths of a degree error in the measurement of the monthly average temperature of the top mile of water of almost the entire ocean? Really? They believe they can measure the ocean temperature to that kind of precision, much less accuracy?
I find that very difficult to believe. I understand the law of large numbers and the central limit theorem and how that gives us extra leverage, but I find the idea that we can measure the temperature of four hundred million cubic kilometres of ocean water to a precision of ± eight thousandths of a degree to be … well, let me call it unsubstantiated. Others who have practical experience in measuring the temperatures of liquids to less than a hundredth of a degree, feel free to chime in, but to me that seems like a bridge way too far. Yes, there are some 2,500 Argo floats out there, and on a map the ocean looks pretty densely sampled. Figure 3 shows where the Argo floats were in 2011.
Figure 3. Locations of Argo floats, 2011. SOURCE
But that’s just a chart. The world is unimaginably huge. In the real ocean, down to a kilometre and a half of depth, that’s one Argo thermometer for each 165,000 cubic kilometres of water … I’m not sure how to give an idea of just how big that is. Let’s try it this way. Lake Superior is the largest lake in the Americas, visible even on the world map above. How accurately could you measure the average monthly temperature of the entire volume of Lake Superior with one Argo float? Sure, you can let it bob up and down and drift around the lake, and it will take three vertical profiles a month. But even then, each measurement will cover only a tiny part of the entire lake.
But it’s worse for Argo. Each of the Argo floats, each dot in Figure 3, is representing a volume as large as 13 Lake Superiors … with one lonely Argo thermometer …
Or we could look at it another way. There were about 2,500 Argo floats in operation over the period covered by SLT2011. The area of the ocean is about 360 million square km. So each Argo float represents an area of about 140,000 square kilometres, which is a square about 380 km (240 mi) on each side. One Argo float for all of that. Ten days for each dive cycle, wherein the float goes down to about 1000 metres and stays there for nine days. Then it either rises from there, or it descends to about 2,000 metres, then rises to the surface at about 10 cm (4″) per second over about six hours profiling the temperature and salinity as it rises. So we get three vertical temperature profiles from 0-1,000 or 0-1,500 or 0-2,000 metres each month depending on the particular float, to cover an area of 140,000 square kilometres … I’m sorry, but three vertical temperature profiles per month to cover an area of 60,000 square miles and a mile deep doesn’t scream “thousandths of a degree temperature accuracy” to me.
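The arithmetic behind those figures is simple enough to check. This sketch uses only the round numbers quoted above (2,500 floats, 360 million square km, a 2,000 metre ascent at 10 cm per second); the variable names are illustrative.

```python
import math

N_FLOATS = 2500                 # floats in operation, from the text
OCEAN_AREA_KM2 = 360e6          # approximate ocean area, from the text
KM_PER_MILE = 1.609344

area_per_float_km2 = OCEAN_AREA_KM2 / N_FLOATS     # area each float stands for
side_km = math.sqrt(area_per_float_km2)            # side of an equivalent square
side_miles = side_km / KM_PER_MILE

ascent_hours = 2000.0 / 0.10 / 3600.0              # 2,000 m at 10 cm/s
profiles_per_month = 30 / 10                       # one profile per ten-day cycle

print(area_per_float_km2, round(side_km), round(side_miles))
print(round(ascent_hours, 1), profiles_per_month)
```

The exact area comes out at 144,000 square km, which the text rounds to 140,000.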
Here’s a third way to look at the size of the measurement challenge. For those who have been out of sight of land in a small boat, you know how big the ocean looks from the deck? Suppose the deck of the boat is a metre (3′) above the water, and you stand up on deck and look around. Nothing but ocean stretching all the way to the horizon, a vast immensity of water on all sides. How many thermometer readings would it take to get the monthly average temperature of just the ocean you can see, to the depth of one mile? I would say … more than one.
Now, consider that each Argo float has to cover an area that is more than 2,000 times the area of the ocean you can see from your perch standing there on deck … and the float is making three dives per month … how well do the measurements encompass and represent the reality?
There is another difficulty. Figure 2 shows that most of the change over the period occurred in a single year, from about mid 2007 to mid 2008. The change in forcing required to change the temperature of a kilometre and a half of water that much is about 2 W/m2 for that year-long period. The “imbalance”, to use Hansen’s term, is even worse when we look at the amount of energy required to warm the upper ocean from May 2007 to August 2008. That requires a global “imbalance” of about 2.7 W/m2 over that period.
Now, if that were my dataset, the first thing I’d be looking at is what changed in mid 2007. Why did the global “imbalance” suddenly jump to 2.7 W/m2? And more to the point, why did the upper ocean warm, but not the surface temperature?
I don’t have any answers to those questions, my first guess would be “clouds” … but before I used that dataset, I’d want to go down that road to find out why the big jump in 2007. What changed, and why? If our interest is in global “imbalance”, there’s an imbalance to study.
(In passing, let me note that there is an incorrect simplifying assumption to eliminate ocean heat content in order to arrive at the canonical climate equation. That canonical equation is
Change In Temperature = Sensitivity times Change In Forcing
The error is to assume that the change in oceanic heat content (OHC) is a linear function of surface temperature change ∆T. It is not, as the Argo data confirms … I discussed this error in a previous post, “The Cold Equations”. But I digress …)
The SLT2011 Argo record also has an oddity shared by some other temperature records. The swing in the whole time period is about a hundredth of a degree. The largest one-year jump in the data is about a hundredth of a degree. The largest one-month jump in the data is about a hundredth of a degree. When short and long time spans show the same swings, it’s hard to say a whole lot about the data. It makes the data very difficult to interpret. For example, the imbalance necessary to give the largest one-month change in OHC is about 24 W/m2. Before moving forwards, changes in OHC like that would be worth looking at to see a) if they’re real and b) if so, what changed …
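The imbalance figures quoted above can be roughly reproduced from the size of the temperature jump and the time it took. This is a sketch under assumptions: the seawater density and specific heat are typical textbook values rather than anything from SLT2011, the jump is taken as a round hundredth of a degree, and the flux is expressed per square metre of the measured ocean column.

```python
# Assumed seawater properties (typical textbook values).
RHO_SEAWATER = 1025.0     # kg/m^3
CP_SEAWATER = 3990.0      # J/(kg K)
COLUMN_DEPTH_M = 1490.0   # depth of the measured column, from the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def implied_flux_w_per_m2(delta_t_c, duration_s):
    """Average heating rate (W/m^2) needed to warm the column by delta_t_c."""
    joules_per_m2 = RHO_SEAWATER * CP_SEAWATER * COLUMN_DEPTH_M * delta_t_c
    return joules_per_m2 / duration_s


flux_one_year = implied_flux_w_per_m2(0.01, SECONDS_PER_YEAR)
flux_one_month = implied_flux_w_per_m2(0.01, SECONDS_PER_YEAR / 12)
print(round(flux_one_year, 1), round(flux_one_month, 1))
```

A hundredth of a degree in a year works out to about 2 W/m2, and the same jump in a single month needs roughly twelve times that, in the low twenties of W/m2. The exact value depends on the digitized size of the jump and on the assumed seawater properties.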
In any case, that was my second issue, the tiny size of the temperature differences being measured.
Next, coverage. The Argo analysis of SLT2011 only uses data down to 1,500 metres depth. They say that the Argo coverage below that depth is too sparse to be meaningful, although the situation is improving. In addition, the Argo analysis only covers from 60°N to 60°S, which leaves out the Arctic and Southern Oceans, again because of inadequate coverage. Next, it starts at 10 metres below the surface, so it misses the crucial surface layer which, although small in volume, undergoes large temperature variations. Finally, their analysis misses the continental shelves because it only considers areas where the ocean is deeper than one kilometre. Figure 4 shows how much of the ocean volume the Argo floats are actually measuring in SLT2011, about 31%.
In addition to the amount measured by Argo floats, Figure 4 shows that there are a number of other oceanic volumes. H2011 includes figures for some of these, including the Southern Ocean, the Arctic Ocean, and the Abyssal waters. Hansen points out that the source he used (Purkey and Johnson, hereinafter PJ2010) says there is no temperature change in the waters between 2 and 4 km depth. This is most of the water shown on the right side of Figure 4. It is not clear how the bottom waters are warming without the middle waters warming. I can’t think of how that might happen … but that’s what PJ2010 says, that the blue area on the right, representing half the oceanic volume, is not changing temperature at all.
Neither H2011 nor SLT2011 offers an analysis of the effect of omitting the continental shelves, or the thin surface layer. In that regard, it is worth noting that a ten-metre thin surface layer like that shown in Figure 4 can change by a full degree in temperature without much problem … and if it does so, that would be about the same change in ocean heat content as the 0.01°C of warming of the entire volume measured by the Argo floats. So that surface layer is far too large a factor to be simply omitted from the analysis.
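A quick per-square-metre comparison shows why the two heat content changes are of the same order. This is a sketch that assumes both layers have the same density and specific heat, so those cancel and only depth times temperature change matters; it also ignores the fact that the two layers do not cover exactly the same ocean area.

```python
# Heat content change per unit area is proportional to depth * delta_T
# when density and specific heat are taken to be the same (an assumption).
surface_layer = 10.0 * 1.0      # 10 m layer warming by 1 deg C
argo_column = 1490.0 * 0.01     # 1,490 m column warming by 0.01 deg C

ratio = surface_layer / argo_column
print(round(ratio, 2))
```

On these round numbers the surface layer carries about two thirds of the heat content change of the whole Argo column, which is comparable in size rather than negligible.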
There is another problem with the figures H2011 uses for the change in heat content of the abyssal waters (below 4 km). The cited study, PJ2010, says:
Excepting the Arctic Ocean and Nordic seas, the rate of abyssal (below 4000 m) global ocean heat content change in the 1990s and 2000s is equivalent to a heat flux of 0.027 (±0.009) W m−2 applied over the entire surface of the earth. SOURCE: PJ2010
That works out to a claimed warming rate of the abyssal ocean of 0.0007°C per year, with a claimed 95% confidence interval of ± 0.0002°C/yr. … I’m sorry, but I don’t buy it. I do not accept that we know the rate of the annual temperature rise of the abyssal waters to the nearest two ten-thousandths of a degree per year, no matter what PJ2010 might claim. The surface waters are sampled regularly by thousands of Argo floats. The abyssal waters see the odd transect or two per decade. I don’t think our measurements are sufficient.
One problem here, as with much of climate science, is that the only uncertainty that is considered is the strict mathematical uncertainty associated with the numbers themselves, dissociated from the real world. There is an associated uncertainty that is sometimes not considered. This is the uncertainty of how much your measurement actually represents the entire volume or area being measured.
The underlying problem is that temperature is an “intensive” quality, whereas something like mass is an “extensive” quality. Measuring these two kinds of things, intensive and extensive variables, is very, very different. An extensive quality is a quality that changes with the amount (the “extent”) of whatever is being measured. The mass of two glasses of water at 40° temperature is twice the mass of one glass of water at 40° temperature. To get the total mass, we just add the two masses together.
But do we add the two 40° temperatures together to get a total temperature of 80°? Nope, it doesn’t work that way, because temperature is an intensive quality. It doesn’t change based on the amount of stuff we are measuring.
Extensive qualities are generally easy to measure. If we have a large bathtub full of water, we can easily determine its mass. Put it on a scale, take one single measurement, you’re done. One measurement is all that is needed.
But the average temperature of the water is much harder to determine. It requires simultaneous measurement of the water temperature in as many places as are required. The number of thermometers required depends on the accuracy you need and the amount of variation in the water temperature. If there are warm spots or cold parts of the water in the tub, you’ll need a lot of thermometers to get an average that is accurate to say a tenth of a degree.
Now recall that instead of a bathtub with lots of thermometers, for the Argo data we have a chunk of ocean that’s 380 km (240 miles) on a side with a single Argo float taking its temperature. We’re measuring down a kilometre and a half (about a mile), and we get three vertical temperature profiles a month … how well do those three vertical temperature profiles characterize the actual temperature of sixty thousand square miles (140,000 sq. km.) of ocean?
Then consider further that the abyssal waters have far, far fewer thermometers way down there … and yet they claim even greater accuracies than the Argo data.
Please be clear that my argument is not about the ability of large numbers of measurements to improve the mathematical precision of the result. We have about 7,500 Argo vertical profiles per month. With the ocean surface divided into 864 gridboxes, if the standard deviation (SD) of the depth-integrated gridbox measurements is about 0.24°C, this is enough to give us mathematical precision of the order of magnitude that they have stated. The question is whether the SD of the gridboxes is that small, and if so, how they got that small.
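Here is the arithmetic behind that statement as a minimal sketch. The 864 gridboxes follow from the 5° latitude by 10° longitude boxes between 60°S and 60°N described in SLT2011. The standard error formula assumes the gridbox values are independent, which is an assumption made here for illustration and not something the paper establishes.

```python
import math

# Gridbox count: 5 deg latitude x 10 deg longitude boxes, 60S to 60N.
n_lat_bands = (60 - (-60)) // 5     # 24 bands of latitude
n_lon_bands = 360 // 10             # 36 bands of longitude
n_boxes = n_lat_bands * n_lon_bands

# Standard error of the mean, assuming independent gridbox values.
gridbox_sd_c = 0.24                 # deg C, the SD quoted in the text
standard_error_c = gridbox_sd_c / math.sqrt(n_boxes)

print(n_boxes, round(standard_error_c, 4))
```

That gives a standard error of about eight thousandths of a degree, which is the order of magnitude claimed. Everything therefore hinges on whether the 0.24°C spread is realistic.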
They discuss how they did their error analysis. I suspect that their problem lies in two areas. One is that I see no error estimate for the removal of the “climatology”, the historical monthly average, from the data. The other problem involves the arcane method used to analyze the data by gridding the data both horizontally and vertically. I’ll deal with the climatology question first. Here is their description of their method:
2.2 Data processing method
An Argo climatology (ACLIM hereinafter, 2004–2009, von Schuckmann et al., 2009) is first interpolated on every profile position in order to fill gappy profiles at depth of each temperature and salinity profile. This procedure is necessary to calculate depth-integrated quantities. OHC [ocean heat content], OFC [ocean freshwater content] and SSL [steric (temperature related) sea level] are then calculated at every Argo profile position as described in von Schuckmann et al. (2009). Finally, anomalies of the physical properties at every profile position are calculated relative to ACLIM.
Terminology: a “temperature profile” is a string of measurements taken at increasing depths by an Argo float. A “profile position” is one of the preset pressure levels at which the Argo floats are set to take a sample.
This means that if there is missing data in a given profile, it is filled in using the “climatology”, or the long-term average of the data for that month and place. Now, this is going to introduce an error, not likely large, and one that they account for.
What I don’t find accounted for in their error calculation is any error estimate related to the final sentence in the quoted paragraph above. That sentence describes the subtraction of the ACLIM climatology from the data. ACLIM is an “Argo climatology”, which is a month-by-month average of the average temperatures of each depth level.
SLT2011 refers this question to an earlier document by the same authors, SLT2009, which describes the creation of the ACLIM climatology. I find that there are over 150 levels in the ACLIM climatology, as described by the authors:
The configuration is defined by the grid and the set of a priori information such as the climatology, a priori variances and covariances which are necessary to compute the covariance matrices. The analyzed field is defined on a horizontal 1/2° Mercator isotropic grid and is limited from 77°S to 77°N. There are 152 vertical levels defined between the surface and 2000m depth … The vertical spacing is 5m from the surface down to 100m depth, 10m from 100m to 800m and 20m from 800m down to 2000m depth.
So they have divided the upper ocean into gridboxes, and each gridbox into layers, to give gridcells. How many gridcells? Well, 360 degrees longitude * 2 * 180 degrees latitude * 2 * 70% of the world is ocean * 152 layers = 27,578,880 oceanic gridcells. Then they’ve calculated the month by month average temperature of each of those twenty-seven million oceanic volumes … a neat trick. Clearly, they are interpolating like mad.
There are about 450,000 discrete ocean temperatures per month reported by the Argo floats. That means that each of their 27 million gridcells gets its temperature taken on average once every five years …
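The gridcell count and the sampling interval can be checked in a few lines. Like the estimate above, this sketch ignores the 77°S to 77°N limit and the Mercator spacing of the real grid, so it is a rough count and not the exact ACLIM figure.

```python
# Half-degree grid over the whole globe, 70% ocean, 152 vertical levels.
cells_lon = 360 * 2
cells_lat = 180 * 2
n_layers = 152

# Integer arithmetic for the 70% ocean fraction avoids rounding noise.
ocean_columns = cells_lon * cells_lat * 7 // 10
ocean_gridcells = ocean_columns * n_layers

readings_per_month = 450_000        # Argo temperatures per month, from the text
months_between_readings = ocean_gridcells / readings_per_month
years_between_readings = months_between_readings / 12

print(ocean_gridcells, round(years_between_readings, 1))
```

So on average a given gridcell sees one actual temperature reading roughly every five years.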
That is the “climatology” that they are subtracting from each “profile position” on each Argo dive. Obviously, given the short history of the Argo dataset, the coverage area of 60,000 sq. miles (140,000 sq. km.) per Argo float, and the small gridcell size, there are large uncertainties in the climatology.
So when they subtract a climatology from an actual measurement, the result contains not just the error in the measurement. It contains the error in the climatology as well. When we are doing subtraction, errors add “in quadrature”. This means the resultant error is the square root of the sum of the squares of the errors. It also means that the big error rules, particularly when one error is much larger than the other. The temperature measurement at the profile position has just the instrument error. For Argo, that’s ± 0.005°C. The climatology error? Who knows, when the volumes are only sampled once every five years? But it’s much more than the instrument error …
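To see how the big error rules, here is a small sketch of adding errors in quadrature. The instrument error is the ± 0.005°C quoted above. The climatology error is unknown, so the value used below is purely hypothetical and is there only to show the effect.

```python
import math


def combined_error(*errors):
    """Combine independent errors in quadrature (root sum of squares)."""
    return math.sqrt(sum(e * e for e in errors))


INSTRUMENT_ERROR_C = 0.005            # Argo instrument error, from the text
HYPOTHETICAL_CLIM_ERROR_C = 0.05      # made-up value, for illustration only

total = combined_error(INSTRUMENT_ERROR_C, HYPOTHETICAL_CLIM_ERROR_C)
print(round(total, 4))
```

With a climatology error ten times the instrument error, the combined error is within one percent of the climatology error alone. The instrument precision hardly matters once the other term is larger.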
So that’s the main problem I see with their analysis. They’re doing it in a difficult-to-trace, arcane, and clunky way. Argo data, and temperature data in general, does not occur in some gridded world. Doing the things they do with the gridboxes and the layers introduces errors. Let me show you one example of why. Figure 5 shows the depth layers of 5 metres used in the upper shallower section of the climatology, along with the records from one Argo float temperature profile.
Figure 5. ACLIM climatology layers (5 metre). Red circles show the actual measurements from a single Argo temperature profile. Blue diamonds show the same information after averaging into layers. Photo Source
Several things can be seen here. First, there is no data for three of the climatology layers. A larger problem is that when we average into layers, in essence we assign that averaged value to the midpoint in the layer. The problem with this procedure arises because in the shallows, the Argo floats sample at slightly less than 10 metre intervals. So the upper measurements are just above the bottom edge of the layer. As a result when they are averaged into the layers, it is as though the temperature profile has been hoisted upwards by a couple of metres. This introduces a large bias into the results. In addition, the bias is depth-dependent, with the shallows hoisted upwards, but deeper sections moved downwards. The error is smallest below 100 metres, but gets large quite quickly after that because of the change in layer thickness to 10 metres.
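The hoisting effect can be illustrated with a toy profile. The sample depths below are hypothetical: they assume a float sampling every 9.6 metres starting at 4.8 metres, chosen only because that is slightly less than 10 metres. They are not the depths of the float shown in Figure 5.

```python
LAYER_THICKNESS_M = 5.0

# Hypothetical sample depths, 9.6 m apart (assumption for illustration).
sample_depths_m = [4.8, 14.4, 24.0, 33.6, 43.2, 52.8, 62.4, 72.0, 81.6, 91.2]


def layer_midpoint(depth_m):
    """Midpoint of the 5 m layer that contains depth_m."""
    index = int(depth_m // LAYER_THICKNESS_M)
    return (index + 0.5) * LAYER_THICKNESS_M


# Negative shift: the value is assigned to a shallower depth than measured.
shifts_m = [layer_midpoint(d) - d for d in sample_depths_m]

# Layers in the top 100 m that receive no measurement at all.
occupied = {int(d // LAYER_THICKNESS_M) for d in sample_depths_m}
empty_layers = [i for i in range(20) if i not in occupied]

print([round(s, 1) for s in shifts_m])
print(len(empty_layers))
```

In this toy case the shallowest reading is moved up by 2.3 metres, the shift shrinks with depth and then changes sign, and half of the 5 metre layers get no data at all.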
Finally, we come to the question of the analysis method, and the meaning of the title of this post. The SLT2011 document goes on to say the following:
To estimate GOIs [global oceanic indexes] from the irregularly distributed profiles, the global ocean is divided into boxes of 5° latitude, 10° longitude and 3-month size. This provides a sufficient number of observations per box. To remove spurious data, measurements which depart from the mean at more than 3 times the standard deviation are excluded. The variance information to build this criterion is derived from ACLIM. This procedure excludes about 1 % of data from our analysis. Only data points which are located over bathymetry deeper than 1000 m depth are then kept. Boxes containing less than 10 measurements are considered as a measurement gap.
Now, I’m sorry, but that’s just a crazy method for analyzing this kind of data. They’ve taken the actual data. Then they’ve added “climatology” data where there were gaps, so everything was neat and tidy. Then they’ve subtracted the “climatology” from the whole thing, with an unknown error. Then the data is averaged into gridboxes of five by ten degrees, and into 150 levels below the surface, of varying thickness, and then those are averaged over a three-month period … that’s all unnecessary complexity. This is a problem that once again shows the isolation of the climate science community from the world of established methods.
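To make the quoted box procedure concrete, here is a rough sketch of the binning, the three-sigma screen, and the ten-measurement rule. It is an illustration under assumptions and not the authors’ code: the function names are invented, the screen is applied to anomalies relative to the climatology (so the mean is taken as zero), the standard deviation is a single made-up number, and the bathymetry test is left out.

```python
import math
from collections import defaultdict


def box_key(lat, lon, month):
    """Index of the 5 deg lat x 10 deg lon x 3-month box (month is 0-based)."""
    return (math.floor(lat / 5), math.floor(lon / 10), month // 3)


def box_means(profiles, clim_sd, min_count=10):
    """profiles: (lat, lon, month, anomaly) tuples. Returns {box: mean or None}."""
    boxes = defaultdict(list)
    for lat, lon, month, anomaly in profiles:
        if abs(anomaly) > 3.0 * clim_sd:    # three-sigma screen
            continue
        boxes[box_key(lat, lon, month)].append(anomaly)
    # Boxes with fewer than min_count measurements count as a gap (None).
    return {k: (sum(v) / len(v) if len(v) >= min_count else None)
            for k, v in boxes.items()}


# Hypothetical data: a well-sampled box with one wild value, and a sparse box.
profiles = [(12.0, 145.0, 1, 0.1)] * 12 + [(12.0, 145.0, 1, 5.0)]
profiles += [(-33.0, 20.0, 7, 0.2)] * 3
result = box_means(profiles, clim_sd=0.5)
print(result)
```

The wild value is dropped, the well-sampled box gets a mean, and the sparse box is treated as a gap. Note that nothing in this procedure says how uncertain each box mean is.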
This problem, of having vertical Argo temperature profiles at varying locations and wanting to estimate the temperature of the unseen remainder based on the profiles, is not new at all. In fact, it is precisely the situation faced by every mining company with regard to their test drill hole results. Exactly as with Argo data, the mining companies have vertical profiles of the composition of the subsurface reality at variously spaced locations. Again just as with Argo, from that information, the mining companies need to estimate the parts of the underground world that they cannot see.
But these are not AGW-supporting climate scientists, for whom mistaken claims mean nothing. These are guys betting big bucks on the outcome of their analysis. I can assure you that they don’t futz around dividing the area up into rectangular boxes, and splitting the underground into 150 layers of varying thicknesses. They’d laugh at anyone who tried to estimate an ore body using such a klutzy method.
Instead, they use a mathematical method called “kriging”. Why do they use it? First, because it works.
Remember that the mining companies cannot afford mistakes. Kriging (and its variants) has been proven, time after time, to provide the best estimates of what cannot be measured under the surface.
Second, kriging provides actual error estimates, not the kind of “eight thousandths of a degree” nonsense promoted by the Argo analysts. The mining companies can’t delude themselves that they have more certainty than is warranted by the measurements. They need to know exactly what the risks are, not some overly optimistic calculation.
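For readers who have not met kriging, here is a bare-bones sketch of ordinary kriging along a single line, showing how it returns an estimate together with an error variance. Everything specific in it is an assumption for illustration: the exponential covariance model, the sill and correlation length, the station positions, and the temperatures are all made up, and a real analysis would fit the covariance to the data and work in three dimensions.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def covariance(h, sill, length):
    """Assumed exponential covariance model."""
    return sill * math.exp(-abs(h) / length)


def ordinary_krige(xs, zs, x0, sill=1.0, length=300.0):
    """Return (estimate, kriging variance, weights) at position x0."""
    n = len(xs)
    a = [[covariance(xs[i] - xs[j], sill, length) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])          # weights must sum to one
    b = [covariance(xi - x0, sill, length) for xi in xs] + [1.0]
    sol = solve(a, b)
    w, mu = sol[:n], sol[n]
    estimate = sum(wi * zi for wi, zi in zip(w, zs))
    variance = sill - sum(wi * ci for wi, ci in zip(w, b[:n])) - mu
    return estimate, max(variance, 0.0), w


# Hypothetical floats 380 km apart with made-up temperatures (deg C).
xs, zs = [0.0, 380.0, 760.0], [10.2, 10.5, 9.9]
print(ordinary_krige(xs, zs, 190.0)[:2])    # between two floats
print(ordinary_krige(xs, zs, 2000.0)[:2])   # far from any float
```

The point to notice is that the variance comes out of the same calculation as the estimate. It is zero at a measured location and grows as you move away from the data, so sparse sampling shows up directly in the stated uncertainty.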
At the end of the day, I’d say throw out the existing analyses of the Argo data, along with all of the inflated claims of accuracy. Stop faffing about with gridboxes and layers; that’s high-school stuff. Get somebody who is an expert in kriging, and analyze the data properly. My guess is that a real analysis will show error intervals that render many of the estimates useless.
Anyhow, that’s my analysis of the Hansen Energy Imbalance paper. They claim an accuracy that I don’t think their hugely complex method can attain.
It’s a long post, likely inaccuracies and typos have crept in, be gentle …