I got into this investigation of Argo because I disbelieved their claimed error of 0.002°C for the annual average temperature of the top mile of the ocean. I discussed this in “Decimals of Precision”, where I showed that the error estimates were overly optimistic.
This is the next part of my slow wander through the Argo data. In “How well can we derive Global Ocean Indicators from Argo data?” (PDF, hereinafter SLT2011), K. von Schuckmann and P.-Y. Le Traon describe their method for analyzing the Argo data:
To evaluate GOIs [global oceanic indicators] from the irregularly distributed global Argo data, temperature and salinity profiles during the years 2005 to 2010 are uploaded spanning 10 to 1500m depth.
To estimate the GOIs from the irregularly distributed profiles, the global ocean is first divided into boxes of 5° latitude, 10° longitude and 3 month size. This provides a sufficient number of observations per box.
So I thought I’d take a look at some gridcell boxes. I’ve picked one which is typical, and shows some of the issues involved in determining a trend. Figure 1 shows the location of the temperature profiles for that gridbox, as well as showing the temperatures by latitude and day. The data in all cases is for the first three months of the year. The top row of Figure 1 shows all of the temperature measurements for those three months (Jan-Feb-Mar) from all the years 2005-2011. The bottom row shows just the 2005 measurements. The following figures will show the other years.
Figure 1. Gridcell is in the Atlantic, from 25°-30°N, and 30°-40°W. Left column shows the physical location of the samples within the gridbox. Colors in the left column are randomly assigned to different floats, one color per float. Right column shows the temperature by latitude. Small numbers above each sample show the day of the year that the sample was taken. Colors in the right column show the day the sample was taken, with red being day one of the year, shading through orange, yellow and green to end at blue at day 91. Top row shows all years. Bottom row shows 2005. Text in the right column gives the mean (average) of the temperature measurements, the standard deviation (StdDev), and the 95% confidence interval (95%CI) of the mean of the temperature data. The 95% CI is calculated as the standard error of the mean times 1.96.
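The three statistics quoted in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the code used to make the figures: the sample values are made up, and the use of the sample standard deviation (dividing by n - 1) is an assumption.

```python
import math
import statistics

def summarize(temps):
    """Return (mean, sample standard deviation, 95% CI half-width)."""
    n = len(temps)
    mean = statistics.fmean(temps)
    std_dev = statistics.stdev(temps)   # sample standard deviation (n - 1), an assumption
    sem = std_dev / math.sqrt(n)        # standard error of the mean
    return mean, std_dev, 1.96 * sem    # 95% CI = 1.96 x standard error

# Hypothetical temperatures in °C, NOT actual Argo values
sample = [21.2, 21.5, 21.9, 22.3, 20.9, 21.6, 22.0, 21.4, 21.7]
mean, std_dev, ci95 = summarize(sample)
print(f"mean = {mean:.2f} °C, StdDev = {std_dev:.2f} °C, 95% CI = ±{ci95:.2f} °C")
```

Note that this calculation treats every sample as an independent draw from the whole gridbox, which is exactly the assumption at issue in what follows.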
Let’s consider the top row first. In the left column, we see the physical location of all samples that Argo floats took from 2005 to 2011. We have pretty good coverage of the area of the gridbox over that time. Note that the gridboxes are huge, half a million square kilometres for this particular one. So even with the 216 samples taken over the seven-year period, that’s still only one sample per 2,500 square km.
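As a rough check on those numbers, the area of the gridbox can be estimated on a spherical Earth. The radius value and the function name below are my own choices for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation (assumption)

def gridbox_area_km2(lat_south, lat_north, lon_west, lon_east):
    """Area of a latitude-longitude box on a sphere, in square kilometres."""
    d_lon = math.radians(abs(lon_east - lon_west))
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * d_lon * band

# Gridcell from the text: 25°-30°N, 30°-40°W
area = gridbox_area_km2(25, 30, -40, -30)
n_samples = 216
print(f"area ≈ {area:,.0f} km², one sample per {area / n_samples:,.0f} km²")
```

Under these assumptions the box comes out a bit over half a million square kilometres, consistent with the round figures above.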
Next, let’s consider the top right image. This shows how the temperatures vary by time and by latitude. As you would expect, the further north you go, the colder the ocean, with a swing of about three degrees from north to south.
In addition, you can see that the ocean is cooling from day 1 (start of January) to day 91 (end of March). The early records (red and orange) are on the right (warmer) side of the graph. The later records (green and blue) are concentrated on the left-hand (cooler) side of the graph.
This leads to a curious oddity. The spread (standard deviation) of the temperature records from any given float depends on the direction that the float is moving. If the float is moving south, it is moving into warmer waters, but the water generally is cooling, so the spread of temperatures is reduced. If the float is moving north, on the other hand, it is moving into cooler waters, and in addition the water is generally cooling, so the spread is increased. It is unclear what effect this will have on the results … but it won’t make them more accurate. You’d think that the directions of the floats might average out, but no such luck: south is more common than north in these months for this gridcell.
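A toy model makes the direction effect concrete. Everything here is assumed for illustration: the linear temperature field, the seasonal cooling rate, the drift speed, and the ten-day sampling interval. Only the rough size of the north-south gradient (about three degrees across the 5° box) comes from the data above.

```python
import statistics

LAT_GRADIENT = -0.6     # °C per degree of latitude northward (about 3 °C over 5°)
SEASONAL_RATE = -0.01   # °C per day of seasonal cooling (assumed, illustrative only)

def temperature(lat, day):
    """Toy model: temperature falls linearly with latitude and with day of year."""
    return 21.7 + LAT_GRADIENT * (lat - 27.5) + SEASONAL_RATE * (day - 46)

def float_spread(start_lat, drift_per_day, days=range(1, 92, 10)):
    """Standard deviation of the temperatures recorded by one drifting float."""
    temps = [temperature(start_lat + drift_per_day * (d - 1), d) for d in days]
    return statistics.stdev(temps)

south = float_spread(start_lat=28.0, drift_per_day=-0.01)  # drifting south
north = float_spread(start_lat=27.0, drift_per_day=+0.01)  # drifting north
print(f"southbound spread: {south:.2f} °C, northbound spread: {north:.2f} °C")
```

With these made-up rates the southbound float partly cancels the seasonal cooling and the northbound float adds to it, so the same ocean yields a spread several times larger for one float than for the other.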
A second problem affecting the accuracy can be seen in the lower left graph of Figure 1. It seems that we have nine measurements … but they’re all located within one tiny part of the entire gridbox. This may or may not make a difference, depending on exactly where the measurements are located, and which direction the float is moving. We can see this in the upper row of Figure 2.
The effects I described above can be seen in the upper row of Figure 2 (2006), where the floats are in the northern half of the gridbox and moving generally southwards. There is a second effect visible, which is that one of the two floats (light blue circles) was only within the gridbox in the late (cooler) part of the period, with the first record being on day 62. As a result, the standard deviation of the measurements is small, and the temperature is anomalously low … which gives us a mean temperature of 20.8°C with a confidence interval of ± 0.36°C. In fact, the 95% confidence interval of the 2006 data does not overlap with the confidence interval of the mean of the entire 2005-2011 period (21.7° ± 0.12°C) … not a good sign at all.
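The non-overlap is simple arithmetic on the figures just quoted. A small sketch, with helper names of my own choosing:

```python
def interval(mean, half_width):
    """Return the (low, high) bounds of a confidence interval."""
    return mean - half_width, mean + half_width

def overlaps(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

ci_2006 = interval(20.8, 0.36)   # 2006 mean and 95% CI from the text
ci_all = interval(21.7, 0.12)    # 2005-2011 mean and 95% CI from the text
gap = ci_all[0] - ci_2006[1]     # distance between the two intervals

print(f"2006:      {ci_2006[0]:.2f} to {ci_2006[1]:.2f} °C")
print(f"all years: {ci_all[0]:.2f} to {ci_all[1]:.2f} °C")
print(f"overlap: {overlaps(ci_2006, ci_all)}, gap ≈ {gap:.2f} °C")
```

The top of the 2006 interval sits about 0.4°C below the bottom of the all-years interval.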
The 2007 data offers another problem … there weren’t any Argo floats at all in the gridcell for the entire three months. The authors say that in that case, they replace the year’s data with the “climatology”, which means the long-term average for the time period … but there’s a problem with that. The climatology covers the whole period, but there are more gaps in the first half of the record than in the latter half. As a result, if there is a trend in the data, this procedure is guaranteed to reduce that trend, by some unknown amount.
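To illustrate the mechanism, here is a sketch using a made-up series with a known trend. The numbers are hypothetical, and taking the “climatology” as the mean of the years that do have data is my simplifying assumption, not necessarily what SLT2011 do.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

years = list(range(2005, 2012))
true_temps = [21.0 + 0.1 * (y - 2005) for y in years]  # hypothetical 0.1 °C/year trend

# Suppose the first two years have no floats in the gridcell, and are filled
# with the "climatology" (here: the mean of the years that do have data).
observed = true_temps[2:]
climatology = sum(observed) / len(observed)
filled = [climatology, climatology] + observed

true_trend = ols_slope(years, true_temps)
filled_trend = ols_slope(years, filled)
print(f"true trend:   {true_trend:.3f} °C/year")
print(f"filled trend: {filled_trend:.3f} °C/year")
```

In this toy case the infilled early years pull the fitted trend well below the true one, because the gaps fall at one end of the record.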
Figure 3 shows the next two years, 2008-2009.
2008 averages out very close to the overall average … but that’s just the luck of the draw, as the floats were split between the north and south. 2009 wasn’t so lucky, with most of the records in the south. This leads to a warmer average, as well as a small 95%CI.
Finally, Figure 4 shows 2010 and 2011.
In the final two years of the record, we are finally starting to get a more reasonable number of samples in the gridbox. However, there are still some interesting things going on. Look at the lower right graph. In the lower right of that graph there are two samples (day 71 and 81) from a float which didn’t move at all over those ten days (see bottom left graph, blue circles, with “81” on top of “71”). In those ten days, the temperature in that one single location dropped by almost half a degree …
In this particular gridcell, the averages for each of the years 2005-2011 are 21.4°C, 20.8°C, no data, 21.7°C, 22°C, 21.9°C, and 21.7 °C. This gives a warming trend of 0.13°C/year, as shown in Figure 5.
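The trend can be reproduced from the yearly averages listed above with an ordinary least-squares fit. In this sketch I simply leave 2007 out of the fit; that is my assumption about how the gap is handled, and it does give the quoted figure.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Jan-Feb-Mar gridcell averages from the text, in °C; 2007 has no data
yearly_means = {2005: 21.4, 2006: 20.8, 2008: 21.7,
                2009: 22.0, 2010: 21.9, 2011: 21.7}

trend = ols_slope(list(yearly_means), list(yearly_means.values()))
print(f"trend = {trend:.2f} °C/year")   # prints "trend = 0.13 °C/year"
```

Six points, two of them from badly undersampled years, are all that stand behind that number.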
My question is, how accurate is this trend? Me, I’d say we can’t trust it as far as we can throw it. The problem is that the early years (2005, ’06, and ’07) way undersample the gridcell, but this is hidden because they take a number of samples in one or two small areas. As a result, the confidence intervals are way understated, and the averages do not represent a valid sampling in either time or space.
My conclusion is that we simply do not have the data to say a whole lot about this gridcell. In particular, despite the apparent statistical accuracy of a trend calculated from these numbers, I don’t think we can even say whether the gridcell is warming or cooling.
Finally, the law of large numbers is generally understood to relate to repeated measurements of the same thing. But here, two measurements ten days apart are half a degree different, while two measurements at the same time in different areas of the gridcell are as much as three degrees apart … are we measuring the “same thing” here or not? And if not, if we are measuring different things, what effect does that have on the uncertainty? In addition, all of these error calculations assume what is called “stationarity”, that is to say that the mean of the data doesn’t change over a sufficiently long time period. However, there is no reason to believe this is true. What does this do to the uncertainties?
I don’t have any answers to these questions, and looking at the data seems to only bring up more questions and complications. However, I had said that I doubted we knew the temperature to anything like the precision claimed by the authors. Table 1 of the SLT2011 paper claims a precision for the annual average heat content of the top mile of the ocean of ± 0.21e+8 Joules. Given the volume involved (414e+8 cubic kilometres), this means they are claiming to measure the temperature of the top mile of the ocean to ± 0.002°C, two thousandths of a degree …
As cited above, I showed before that this was unlikely by noting that there are on the order of 3500 Argo floats. The standard error of a mean shrinks as the square root of the number of samples, so cutting the number of floats by a factor of 100 increases the error by only a factor of ten. If the SLT2011 numbers are correct and the error from 3500 floats is ± 0.002°C, it means that 35 floats could measure the temperature of the top mile of the ocean with ten times that error, or ± two hundredths of a degree. This is highly unlikely; the ocean is way too large to be measured to plus or minus two hundredths of a degree by 35 floats.
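The scaling behind that argument is the usual square-root rule for the standard error of a mean, which assumes independent samples. A minimal sketch, with a function name of my own:

```python
import math

def scaled_error(error, n_original, n_reduced):
    """Scale a standard error from n_original to n_reduced independent samples."""
    return error * math.sqrt(n_original / n_reduced)

claimed = 0.002   # °C, the claimed error with roughly 3500 floats
print(f"± {scaled_error(claimed, 3500, 35):.3f} °C")   # prints "± 0.020 °C"
```

One hundred times fewer floats gives only ten times the error, which is what makes the claimed precision so hard to credit.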
Finally, people have the idea that the ocean is well-mixed, and changes slowly and gradually from one temperature to another. Nothing could be further from the truth. The predominant feature of the ocean is eddies. These eddies have a curious property. They can travel, carrying the same water, for hundreds and hundreds of miles. Here’s an article on one eddy that they have studied. Their illustration is shown as Figure 6.
Figure 7 shows another example of the eddying, non-uniform nature of the ocean. It is of the ocean off of the upper East Coast of the US, showing the Gulf Stream.
Figure 7. Oceanic temperature variation and eddies. Blue box is 5° latitude by 10° longitude. Temperature scale runs from blue (10°C, 50°F) to red (25°C, 77°F). SOURCE
The blue rectangle shows the size of the gridcell used in SLT2011. The red circles approximate the distribution within the gridbox of the measurements shown in the bottom row of Figure 1 for 2005. As you can see, this number and distribution of samples is way below the number and breadth of samples required to give us any kind of accuracy. Despite that, the strict statistical standard error of the mean would be very small, since there is little change in temperature in the immediate area. This gives an unwarranted and incorrect appearance of an accuracy of measurement that is simply not attainable by sampling a small area.
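Here is a sketch of that effect using a toy gridbox. The linear temperature field and the sample positions are invented for illustration; only the rough three-degree north-south range and the count of nine clustered samples echo the data discussed earlier.

```python
import math
import statistics

def temperature(lat):
    """Toy gridbox: 23.2 °C at 25°N falling linearly to 20.2 °C at 30°N."""
    return 23.2 - 0.6 * (lat - 25.0)

def mean_and_ci(lats):
    """Mean temperature and 95% CI half-width (1.96 x standard error)."""
    temps = [temperature(lat) for lat in lats]
    sem = statistics.stdev(temps) / math.sqrt(len(temps))
    return statistics.fmean(temps), 1.96 * sem

# Mean of the linear field over 25°-30°N, ignoring the slight narrowing
# of the box toward the north
TRUE_MEAN = temperature(27.5)

clustered = [29.0 + 0.05 * i for i in range(9)]   # nine samples in a 0.4° strip
spread = [25.25 + 0.5 * i for i in range(10)]     # ten samples across the whole box

for name, lats in (("clustered", clustered), ("spread", spread)):
    mean, ci = mean_and_ci(lats)
    print(f"{name}: {mean:.2f} ± {ci:.2f} °C, actual error {mean - TRUE_MEAN:+.2f} °C")
```

In this toy case the clustered samples report the tighter confidence interval and the larger actual error, many times bigger than the interval suggests.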
Why is this important? It is important because measuring the ocean temperature is part of determining the changes in the climate. My contention is that we still have far too little information to give us enough accuracy to say anything meaningful about “missing heat”.
Anyhow, that’s my latest wander through the Argo data. I find nothing to change my mind regarding what I see as greatly overstated precision for the temperature measurements.
My regards to everyone,